Search Results: "rha"

Review: Furious Heaven, by Kate Elliott

Series:	Sun Chronicles #2
Publisher:	Tor
Copyright:	2023
ISBN:	1-250-86701-0
Format:	Kindle
Pages:	725

Furious Heaven is the middle book of a trilogy and a direct sequel to Unconquerable Sun. Don't start here. I also had some trouble remembering what happened in the previous book (grumble recaps mutter), and there are a lot of threads, so I would try to minimize the time between books unless you have a good memory for plot details. This is installment two of gender-swapped Alexander the Great in space. When we last left Sun and her Companions, Elliott had established the major players in this interstellar balance of power and set off some opening skirmishes, but the real battles were yet to come. Sun was trying to build her reputation and power base while carefully staying on the good side of Queen-Marshal Eirene, her mother and the person credited with saving the Republic of Chaonia from foreign dominance. The best parts of the first book weren't Sun herself but wily Persephone, one of her Companions, whose viewpoint chapters told a more human-level story of finding her place inside a close-knit pre-existing friendship group. Furious Heaven turns that all on its head. The details are spoilers (insofar as a plot closely tracking the life of Alexander the Great can contain spoilers), but the best parts of the second book are the chapters about or around Sun. What I find most impressive about this series so far is Elliott's ability to write Sun as charismatic in a way that I can believe as a reader. That was hit and miss at the start of the series, got better towards the end of Unconquerable Sun, and was wholly effective here. From me, that's high but perhaps unreliable praise; I typically find people others describe as charismatic to be some combination of disturbing, uncomfortable, dangerous, or obviously fake. This is a rare case of intentionally-written fictional charisma that worked for me. Elliott does not do this by toning down Sun's ambition. Sun, even more than her mother, is explicitly trying to gather power and bend the universe (and the people in it) to her will. She treats people as resources, even those she's the closest to, and she's ruthless in pursuit of her goals. But she's also honorable, straightforward, and generous to the people around her. She doesn't lie about her intentions; she follows a strict moral code of her own, keeps her friends' secrets, listens sincerely to their advice, and has the sort of battlefield charisma where she refuses to ask anyone else to take risks she personally wouldn't take. And her use of symbolism and spectacle isn't just superficial; she finds the points of connection between the symbols and her values so that she can sincerely believe in what she's doing. I am fascinated by how Elliott shapes the story around her charisma. Writing an Alexander analogue is difficult; one has to write a tactical genius with the kind of magnetic attraction that enabled him to lead an army across the known world, and make this believable to the reader. Elliott gives Sun good propaganda outlets and makes her astonishingly decisive (and, of course, uses the power of the author to ensure those decisions are good ones), but she also shows how Sun is constantly absorbing information and updating her assumptions to lay the groundwork for those split-second decisions. Sun uses her Companions like a foundation and a recovery platform, leaning on them and relying on them to gather her breath and flesh out her understanding, and then leaping from them towards her next goal. Elliott writes her as thinking just a tiny bit faster than the reader, taking actions I was starting to expect but slightly before I had put together my expectation. It's a subtle but difficult tightrope to walk as the writer, and it was incredibly effective for me. The downside of Furious Heaven is that, despite kicking the action into a much higher gear, this book sprawls. There are five viewpoint characters (Persephone and the Phene Empire character Apama from the first book, plus two new ones), as well as a few interlude chapters from yet more viewpoints. Apama's thread, which felt like a minor subplot of the first book, starts paying off in this book by showing the internal political details of Sun's enemy. That already means the reader has to track two largely separate and important stories. Add on a Persephone side plot about her family and a new plot thread about other political factions and it's a bit too much. Elliott does a good job avoiding reader confusion, but she still loses narrative momentum and reader interest due to the sheer scope. Persephone's thread in particular was a bit disappointing after being the highlight of the previous book. She spends a lot of her emotional energy on tedious and annoying sniping at Jade, which accomplishes little other than making them both seem immature and out of step with the significance of what's going on elsewhere. This is also a middle book of a trilogy, and it shows. It provides a satisfying increase in intensity and gets the true plot of the trilogy well underway, but nothing is resolved and a lot of new questions and plot threads are raised. I had similar problems with Cold Fire, the middle book of the other Kate Elliott trilogy I've read, and this book is 200 pages longer. Elliott loves world-building and huge, complex plots; I have a soft spot for them too, but they mean the story is full of stuff, and it's hard to maintain the same level of reader interest across all the complications and viewpoints. That said, I truly love the world-building. Elliott gives her world historical layers, with multiple levels of lost technology, lost history, and fallen empires, and backs it up with enough set pieces and fragments of invented history that I was enthralled. There are at least five major factions with different histories, cultures, and approaches to technology, and although they all share a history, they interpret that history in fascinatingly different ways. This world feels both lived in and full of important mysteries. Elliott also has a knack for backing the ambitions of her characters with symbolism that defines the shape of that ambition. The title comes from a (translated) verse of an in-universe song called the Hymn of Leaving, which is sung at funerals and is about the flight on generation ships from the now-lost Celestial Empire, the founding myth of this region of space:

Crossing the ocean of stars we leave our home behind us.
We are the spears cast at the furious heaven
And we will burn one by one into ashes
As with the last sparks we vanish.
This memory we carry to our own death which awaits us
And from which none of us will return.
Do not forget. Goodbye forever.

This is not great poetry, but it explains so much about the psychology of the characters. Sun repeatedly describes herself and her allies as spears cast at the furious heaven. Her mother's life mission was to make Chaonia a respected independent power. Hers is much more than that, reaching back into myth for stories of impossible leaps into space, burning brightly against the hostile power of the universe itself. A question about a series like this is why one should want to read about a gender-swapped Alexander the Great in space, rather than just reading about Alexander himself. One good (and sufficient) answer is that both the gender swap and the space parts are inherently interesting. But the other place that Elliott uses the science fiction background is to give Sun motives beyond sheer personal ambition. At a critical moment in the story, just like Alexander, Sun takes a detour to consult an Oracle. Because this is a science fiction novel, it's a great SF set piece involving a mysterious AI. But also because this is a science fiction story, Sun doesn't only ask about her personal ambitions. I won't spoil the exact questions; I think the moment is better not knowing what she'll ask. But they're science fiction questions, reader questions, the kinds of things Elliott has been building curiosity about for a book and a half by the time we reach that scene. Half the fun of reading a good epic space opera is learning the mysteries hidden in the layers of world-building. Aligning the goals of the protagonist with the goals of the reader is a simple storytelling trick, but oh, so effective. Structurally, this is not that great of a book. There's a lot of build-up and only some payoff, and there were several bits I found grating. But I am thoroughly invested in this universe now. The third book can't come soon enough. Followed by Lady Chaos, which is still being written at the time of this review. Rating: 7 out of 10

In my recent post about data archiving to removable media, I laid out the difference between backing up and archiving, and also said I d evaluate git-annex and dar. This post evaluates git-annex. The next will look at dar, and then I ll make a comparison post. What is git-annex? git-annex is a fantastic and versatile program that does well, it s one of those things that can do so much that it s a bit hard to describe. Its homepage says:

git-annex allows managing large files with git, without storing the file contents in git. It can sync, backup, and archive your data, offline and online. Checksums and encryption keep your data safe and secure. Bring the power and distributed nature of git to bear on your large files with git-annex.

I think the particularly interesting features of git-annex aren t actually included in that list. Among the features of git-annex that make it shine for this purpose, its location tracking is key. git-annex can know exactly which device has which file at which version at all times. Combined with its preferred content settings, this lets you very easily say things like:

I want exactly 1 copy of every file to exist within the set #1 of backup drives. Here s a drive in that set; copy to it whatever needs to be copied to satisfy that requirement.
Now I have another set of backup drives. Periodically I will swap sets offsite. Copy whatever is needed to this drive in the second set, making sure that there is 1 copy of every file within this set as well, regardless of what s in the first set.
Here s a directory I want to use to track the status of everything else. I don t want any copies at all here.

git-annex can be set to allow a configurable amount of free space to remain on a device, and it will fill it up with whatever copies are necessary up until it hits that limit. Very convenient! git-annex will store files in a folder structure that mirrors the origin folder structure, in plain files just as they were. This maximizes the ability for a future person to access the content, since it is all viewable without any special tool at all. Of course, for things like optical media, git-annex will essentially be creating what amounts to incrementals. To obtain a consistent copy of the original tree, you would still need to use git-annex to process (export) the archives. git-annex challenges In my prior post, I related some challenges with git-annex. The biggest of them quite poor performance of the directory special remote when dealing with many files has been resolved by Joey, git-annex s author! That dramatically improves the git-annex use scenario here! The fixing commit is in the source tree but not yet in a release. git-annex no doubt may still have performance challenges with repositories in the 100,000+-range, but in that order of magnitude it now looks usable. I m not sure about 1,000,000-file repositories (I haven t tested); there is a page about scalability. A few other more minor challenges remain:

git-annex doesn t really preserve POSIX attributes; for instance, permissions, symlink destinations, and timestamps are all not preserved. Of these, timestamps are the most important for my particular use case.
If your data set to archive contains Git repositories itself, these will not be included.

I worked around the timestamp issue by using the mtree-netbsd package in Debian. mtree writes out a summary of files and metadata in a tree, and can restore them. To save: mtree -c -R nlink,uid,gid,mode -p /PATH/TO/REPO -X <(echo './.git') > /tmp/spec And, after restoration, the timestamps can be applied with: mtree -t -U -e < /tmp/spec Walkthrough: initial setup To use git-annex in this way, we have to do some setup. My general approach is this:

There is a source of data that lives outside git-annex. I'll call this $SOURCEDIR.
I'm going to name the directories holding my data $REPONAME.
There will be a "coordination" git-annex repo. It will hold metadata only, and no data. This will let us track where things live. I'll call it $METAREPO.
There will be drives. For this example, I'll call their mountpoints $DRIVE01 and $DRIVE02. For easy demonstration purposes, I used a ZFS dataset with a refquota set (to observe the size handling), but I could have as easily used a LVM volume, btrfs dataset, loopback filesystem, or USB drive. For optical discs, this would be a staging area or a UDF filesystem.

Let's get started! I've set all these shell variables appropriately for this example, and REPONAME to "testdata". We'll begin by setting up the metadata-only tracking repo.



$ REPONAME=testdata

$ mkdir "$METAREPO"

$ cd "$METAREPO"

$ git init

$ git config annex.thin true

There is a sort of complicated topic of how git-annex stores files in a repo, which varies depending on whether the data for the file is present in a given repo, and whether the file is locked or unlocked. Basically, the options I use here cause git-annex to mostly use hard links instead of symlinks or pointer files, for maximum compatibility with non-POSIX filesystems such as NTFS and UDF, which might be used on these devices. thin is part of that. Let's continue:



$ git annex init 'local hub'

init local hub ok

(recording state in git...)

$ git annex wanted . "include=* and exclude=$REPONAME/*"

wanted . ok

(recording state in git...)

In a bit, we are going to import the source data under the directory named $REPONAME (here, testdata). The wanted command says: in this repository (represented by the bare dot), the files we want are matched by the rule that says eveyrthing except what's under $REPONAME. In other words, we don't want to make an unnecessary copy here. Because I expect to use an mtree file as documented above, and it is not under $REPONAME/, it will be included. Let's just add it and tweak some things.



$ touch mtree

$ git annex add mtree

add mtree

ok

(recording state in git...)

$ git annex sync

git-annex sync will change default behavior to operate on --content in a future version of git-annex. Recommend you explicitly use --no-content (or -g) to prepare for that change. (Or you can configure annex.synccontent)

commit

[main (root-commit) 6044742] git-annex in local hub

 1 file changed, 1 insertion(+)

 create mode 120000 mtree

ok

$ ls -l

total 9

lrwxrwxrwx 1 jgoerzen jgoerzen 178 Jun 15 22:31 mtree -> .git/annex/objects/pX/ZJ/...

OK! We've added a file, and it got transformed into a symlink. That's the thing I said we were going to avoid, so:



 git annex adjust --unlock-present

adjust

Switched to branch 'adjusted/main(unlockpresent)'

ok

$ ls -l

total 1

-rw-r--r-- 2 jgoerzen jgoerzen 0 Jun 15 22:31 mtree

You'll notice it transformed into a hard link (nlinks=2) file. Great! Now let's import the source data. For that, we'll use the directory special remote.



$ git annex initremote source type=directory  directory=$SOURCEDIR importtree=yes \

   encryption=none

initremote source ok

(recording state in git...)

$ git annex enableremote source directory=$SOURCEDIR

enableremote source ok

(recording state in git...)

$ git config remote.source.annex-readonly true

$ git config annex.securehashesonly true

$ git config annex.genmetadata true

$ git config annex.diskreserve 100M

$ git config remote.source.annex-tracking-branch main:$REPONAME

OK, so here we created a new remote named "source". We enabled it, and set some configuration. Most notably, that last line causes files from "source" to be imported under $REPONAME/ as we wanted earlier. Now we're ready to scan the source.



$ git annex sync

At this point, you'll see git-annex computing a hash for every file in the source directory. I can verify with du that my metadata-only repo only uses 14MB of disk space, while my source is around 4GB. Now we can see what git-annex thinks about file locations:



$ git-annex whereis   less

whereis mtree (1 copy)

        8aed01c5-da30-46c0-8357-1e8a94f67ed6 -- local hub [here]

ok

whereis testdata/[redacted] (0 copies)

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

failed

... many more lines ...

So remember we said we wanted mtree, but nothing under testdata, under this repo? That's exactly what we got. git-annex knows that the files under testdata can be found under the "source" special remote, but aren't in any git-annex repo -- yet. Now we'll start adding them. Walkthrough: removable drives I've set up two 500MB filesystems to represent removable drives. We'll see how git-annex works with them.



$ cd $DRIVE01

$ df -h .

Filesystem                     Size  Used Avail Use% Mounted on

acrypt/no-backup/annexdrive01  500M  1.0M  499M   1% /acrypt/no-backup/annexdrive01

$ git clone $METAREPO

Cloning into 'testdata'...

done.

$ cd $REPONAME

$ git config annex.thin true

$ git annex init "test drive #1"

$ git annex adjust --hide-missing --unlock

adjust

Switched to branch 'adjusted/main(hidemissing-unlocked)'

ok

$ git annex sync

OK, that's the initial setup. Now let's enable the source remote and configure it the same way we did before:



$ git annex enableremote source directory=$SOURCEDIR

enableremote source ok

(recording state in git...)

$ git config remote.source.annex-readonly true

$ git config remote.source.annex-tracking-branch main:$REPONAME

$ git config annex.securehashesonly true

$ git config annex.genmetadata true

$ git config annex.diskreserve 100M

Now, we'll add the drive to a group called "driveset01" and configure what we want on it:



$ git annex group . driveset01

$ git annex wanted . '(not copies=driveset01:1)'

What this does is say: first of all, this drive is in a group named driveset01. Then, this drive wants any files for which there isn't already at least one copy in driveset01. Now let's load up some files!



$ git annex sync --content

As the messages fly by from here, you'll see it mentioning that it got mtree, and then various files from "source" -- until, that is, the filesystem had less than 100MB free, at which point it complained of no space for the rest. Exactly like we wanted! Now, we need to teach $METAREPO about $DRIVE01.



$ cd $METAREPO

$ git remote add drive01 $DRIVE01/$REPONAME

$ git annex sync drive01

git-annex sync will change default behavior to operate on --content in a future version of git-annex. Recommend you explicitly use --no-content (or -g) to prepare for that change. (Or you can configure annex.synccontent)

commit

On branch adjusted/main(unlockpresent)

nothing to commit, working tree clean

ok

merge synced/main (Merging into main...)

Updating d1d9e53..817befc

Fast-forward

(Merging into adjusted branch...)

Updating 7ccc20b..861aa60

Fast-forward

ok

pull drive01

remote: Enumerating objects: 214, done.

remote: Counting objects: 100% (214/214), done.

remote: Compressing objects: 100% (95/95), done.

remote: Total 110 (delta 6), reused 0 (delta 0), pack-reused 0

Receiving objects: 100% (110/110), 13.01 KiB   1.44 MiB/s, done.

Resolving deltas: 100% (6/6), completed with 6 local objects.

From /acrypt/no-backup/annexdrive01/testdata

 * [new branch]      adjusted/main(hidemissing-unlocked) -> drive01/adjusted/main(hidemissing-unlocked)

 * [new branch]      adjusted/main(unlockpresent)        -> drive01/adjusted/main(unlockpresent)

 * [new branch]      git-annex                           -> drive01/git-annex

 * [new branch]      main                                -> drive01/main

 * [new branch]      synced/main                         -> drive01/synced/main

ok

OK! This step is important, because drive01 and drive02 (which we'll set up shortly) won't necessarily be able to reach each other directly, due to not being plugged in simultaneously. Our $METAREPO, however, will know all about where every file is, so that the "wanted" settings can be correctly resolved. Let's see what things look like now:



$ git annex whereis   less

whereis mtree (2 copies)

        8aed01c5-da30-46c0-8357-1e8a94f67ed6 -- local hub [here]

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

ok

whereis testdata/[redacted] (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

If I scroll down a bit, I'll see the files past the 400MB mark that didn't make it onto drive01. Let's add another example drive! Walkthrough: Adding a second drive The steps for $DRIVE02 are the same as we did before, just with drive02 instead of drive01, so I'll omit listing it all a second time. Now look at this excerpt from whereis:



whereis testdata/[redacted] (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]


  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

whereis testdata/[redacted] (1 copy)

        c4540343-e3b5-4148-af46-3f612adda506 -- test drive #2 [drive02]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

Look at that! Some files on drive01, some on drive02, some neither place. Perfect! Walkthrough: Updates So I've made some changes in the source directory: moved a file, added another, and deleted one. All of these were copied to drive01 above. How do we handle this? First, we update the metadata repo:



$ cd $METAREPO

$ git annex sync

$ git annex dropunused all

OK, this has scanned $SOURCEDIR and noted changes. Let's see what whereis says:



$ git annex whereis   less

...

whereis testdata/cp (0 copies)

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

failed

whereis testdata/file01-unchanged (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

So this looks right. The file I added was a copy of /bin/cp. I moved another file to one named file01-unchanged. Notice that it realized this was a rename and that the data still exists on drive01. Well, let's update drive01.



$ cd $DRIVE01/$REPONAME

$ git annex sync --content

Looking at the testdata/ directory now, I see that file01-unchanged has been renamed, the deleted file is gone, but cp isn't yet here -- probably due to space issues; as it's new, it's undefined whether it or some other file would fill up free space. Let's work along a few more commands.



$ git annex get --auto

$ git annex drop --auto

$ git annex dropunused all

And now, let's make sure metarepo is updated with its state.



$ cd $METAREPO

$ git annex sync

We could do the same for drive02. This is how we would proceed with every update. Walkthrough: Restoration Now, we have bare files at reasonable locations in drive01 and drive02. But, to generate a consistent restore, we need to be able to actually do an export. Otherwise, we may have files with old names, duplicate files, etc. Let's assume that we lost our source and metadata repos and have to restore from scratch. We'll make a new $RESTOREDIR. We'll begin with drive01 since we used it most recently.



$ mv $METAREPO $METAREPO.disabled

$ mv $SOURCEDIR $SOURCEDIR.disabled

$ git clone $DRIVE01/$REPONAME $RESTOREDIR

$ cd $RESTOREDIR

$ git config annex.thin true

$ git annex init "restore"

$ git annex adjust --hide-missing --unlock

Now, we need to connect the drive01 and pull the files from it.



$ git remote add drive01 $DRIVE01/$REPONAME

$ git annex sync --content

Now, repeat with drive02:



$ git remote add drive02 $DRIVE02/$REPONAME

$ git annex sync --content

Now we've got all our content back! Here's what whereis looks like:



whereis testdata/file01-unchanged (3 copies)

        3d663d0f-1a69-4943-8eb1-f4fe22dc4349 -- restore [here]

        9e48387e-b096-400a-8555-a3caf5b70a64 -- source

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [origin]

ok

...

I was a little surprised that drive01 didn't seem to know what was on drive02. Perhaps that could have been remedied by adding more remotes there? I'm not entirely sure; I'd thought would have been able to do that automatically. Conclusions I think I have demonstrated two things: First, git-annex is indeed an extremely powerful tool. I have only scratched the surface here. The location tracking is a neat feature, and being able to just access the data as plain files if all else fails is nice for future users. Secondly, it is also a complex tool and difficult to get right for this purpose (I think much easier for some other purposes). For someone that doesn't live and breathe git-annex, it can be hard to get right. In fact, I'm not entirely sure I got it right here. Why didn't drive02 know what files were on drive01 and vice-versa? I don't know, and that reflects some kind of misunderstanding on my part about how metadata is synced; perhaps more care needs to be taken in restore, or done in a different order, than I proposed. I initially tried to do a restore by using git annex export to a directory special remote with exporttree=yes, but I couldn't ever get it to actually do anything, and I don't know why. These two cut against each other. On the one hand, the raw accessibility of the data to someone with no computer skills is unmatched. On the other hand, I'm not certain I have the skill to always prepare the discs properly, or to do a proper consistent restore.

I've moved to having containers be first-class citizens on my home network, so any local machine (laptop, phone,tablet) can communicate directly with them all, but they're not (by default) exposed to the wider Internet. Here's why, and how. After I moved containers from docker to Podman and systemd, it became much more convenient to run web apps on my home server, but the default approach to networking (each container gets an address on a private network between the host server and containers) meant tedious work (maintaining and reconfiguring a HTTP reverse proxy) to make them reachable by other devices. A more attractive arrangement would be if each container received an IP from the range used by my home LAN, and were automatically addressable from any device on it. To make the containers first-class citizens on my home LAN, first I needed to configure a Linux network bridge and attach the host machine's interface to it (I've done that many times before); then define a new Podman network, of type "bridge". podman-network-create (1) serves as reference, but the blog post Exposing Podman containers fully on the network is an easier read (skip past the macvlan bit). I've opted to choose IP addresses for each container by hand. The Podman network is narrowly defined to a range of IPs that are within the subnet that my ISP-provided router uses, but outside the range of IPs that it allocates. When I start up a container by hand for the first time, I choose a free IP from the sub-range by hand and add a line to /etc/avahi/hosts on the parent machine, e.g.

192.168.1.33 octoprint.local

I then start the container specifying that address, e.g.

podman run --rm -d --name octoprint \
        ...
        --network bridge_local --ip 192.168.1.33 \
        octoprint/octoprint

I can now access that container from any device in my house (laptop, phone, tablet...) via octoprint.local. What's next Although it's not a huge burden, it would be nice to not need to statically define the addresses in /etc/avahi/hosts (perhaps via "IPAM"). I've also been looking at WireGuard (which should be the subject of a future blog post) and combining this with that would be worthwhile.

Just a few days back we came to know about the horrific Train Crash that happened in Odisha (Orissa). There are some things that are known and somethings that can be inferred by observance. Sadly, it seems the incident is going to be covered up

. Some of the facts that have not been contested in the public domain are that there were three lines. One loop line on which the Goods Train was standing and there was an up and a down line. So three lines were there. Apparently, the signalling system and the inter-locking system had issues as highlighted by an official about a month back. That letter, thankfully is in the public domain and I have downloaded it as well. It s a letter that goes to 4 pages. The RW is incensed that the letter got leaked and is in public domain. They are blaming everyone and espousing conspiracy theories rather than taking the minister to task. Incidentally, the Minister has three ministries that he currently holds. Ministry of Communication, Ministry of Electronics and Information Technology (MEIT), and Railways Ministry. Each Ministry in itself is important and has revenues of more than 6 lakh crore rupees. How he is able to do justice to all the three ministries is beyond me

The other thing is funds both for safety and relaying of tracks has been either not sanctioned or unutilized. In fact, CAG and the Railway Brass had shared how derailments have increased and unfulfilled vacancies but they were given no importance

In fact, not talking about safety in the recently held Chintan Shivir (brainstorming session) tells you how much the Govt. is serious about safety. In fact, most of the programme was on high speed rail which is a white elephant. I have shared a whitepaper done by RW in the U.S. that tells how high-speed rail doesn t make economic sense. And that is an economy that is 20 times + the Indian Economy. Even the Chinese are stopping with HSR as it doesn t make economic sense. Incidentally, Air Fares again went up 200% yesterday. Somebody shared in the region of 20k + for an Air ticket from their place to Bangalore

Coming back to the story itself. the Goods Train was on the loopline. Some say it was a little bit on the outer, some say otherwise, but it is established that it was on the loopline. This is standard behavior on and around Railway Stations around the world. Whether it was in the Inner or Outer doesn t make much of a difference with what happened next. The first train that collided with the goods train was the 12864 (SMVB-HWH) Yashwantpur Howrah Express and got derailed on to the next track where from the opposite direction 12841 (Shalimar- Bangalore) Coramandel Express was coming. Now they have said that around 300 people have died and that seems to be part of the cover-up. Both the trains are long trains, having between 23 odd coaches each. Even if you have reserved tickets you have 80 odd people in a coach and usually in most of these trains, it is at least double of that. Lot of money goes to TC and then above (Corruption). The Railway fares have gone up enormously but that s a question for perhaps another time

. So at the very least, we could be looking at more than 1000 people having died. The numbers are being under-reported so that nobody has to take responsibility. The Railways itself has told that it is unable to identify 80% of the people who have died. This means that 80% were unreserved ticket holders or a majority of them. There have been disturbing images as how bodies have been flung over on tractors and whatnot to be either buried or cremated without a thought. We are in peak summer season so bodies will start to rot within 24-48 hours

No arrangements made to cool the bodies and take some information and identifying marks or whatever. The whole thing being done in a very callous manner, not giving dignity to even those who have died for no fault of their own. The dissent note also tells that a cover-up is also in the picture. Apparently, India doesn t have nor does it feel to have a need for something like the NTSB that the U.S. used when it hauled both the plane manufacturer (Boeing) and the FAA when the 737 Max went down due to improper data collection and sharing of data with pilots. And with no accountability being fixed to Minister or any of the senior staff, a small junior staff person may be fired. Perhaps the same official that actually told them about the signal failures almost 3 months back

There were and are also some reports that some jugaadu /temporary fixes were applied to signalling and inter-locking just before this incident happened. I do not know nor confirm one way or the other if the above happened. I can however point out that if such a thing happened, then usually a traffic block is announced and all traffic on those lines are stopped. This has been the thing I know for decades. Traveling between Mumbai and Pune multiple times over the years am aware about traffic block. If some repair work was going on and it wasn t able to complete the work within the time-frame then that may well have contributed to the accident. There is also a bit muddying of the waters where it is being said that one of the trains was 4 hours late, which one is conflicting stories. On top of the whole thing, they have put the case to be investigated by CBI and hinting at sabotage. They also tried to paint a religious structure as mosque, later turned out to be a temple. The RW says done by Muslims as it was Friday not taking into account as shared before that most Railway maintenance works are usually done between Friday Monday. This is a practice followed not just in India but world over. There has been also move over a decade to remove wooden sleepers and have concrete sleepers. Unlike the wooden ones they do not expand and contract as much and their life is much more longer than the wooden ones. Funds had been marked (although lower than last few years) but not yet spent. As we know in case of any accident, it is when all the holes in cheese line up it happens. Fukushima is a great example of that, no sea wall even though Japan is no stranger to Tsunamis. External power at the same level as the plant. (10 meters above sea-level), no training for cascading failures scenarios which is what happened. The Days mini-series shares some but not all the faults that happened at Fukushima and the Govt. response to it. There is a difference though, the Japanese Prime Minister resigned on moral grounds. Here, nor the PM, nor the Minister would be resigning on moral grounds or otherwise :(. Zero accountability and that was partly a natural disaster, here it s man-made. In fact, both the Minister and the Prime Minister arrived with their entourages, did a PR blitzkrieg showing how concerned they are. Within 50 hours, the lines were cleared. The part-time Railway Minister shared that he knows the root cause and then few hours later has given the case to CBI. All are saying, wait for the inquiry report. To date, none of the accidents even in this Govt. has produced an investigation report. And even if it did, I am sure it will whitewash as it did in case of Adani as I had shared before in the previous blog post. Incidentally, it is reported that Adani paid off some of its debt, but when questioned as to where they got the money, complete silence on that part :(. As can be seen cover-up after cover-up

FWIW, the Coramandel Express is known as the Migrant train so has a huge number of passengers, the other one which was collided with is known as sick train as huge number of cancer patients use it to travel to Chennai and come back

Demonetization 2.0 Few days back, India announced demonetization 2.0. Surprised, don t be. Apparently, INR 2k/- is being used for corruption and Mr. Modi is unhappy about it. He actually didn t like the INR 2k/- note but was told that it was needed, who told him we are unaware to date. At that time the RBI Governor was Mr. Urjit Patel who didn t say about INR 2k/- he had said that INR 1k/- note redesigned would come in the market. That has yet to happen. What has happened is that just like INR 500/- and INR 1k/- note is concerned, RBI will no longer honor the INR 2k/- note. Obviously, this has made our neighbors angry, namely Nepal, Sri Lanka, Bhutan etc. who do some trading with us. 2 Deccan herald columns share the limelight on it. Apparently, India wants to be the world s currency reserve but doesn t want to play by the rules for everyone else. It was pointed out that both the U.S. and Singapore had retired their currencies but they will honor that promise even today. The Singapore example being a bit closer (as it s in Asia) is perhaps a bit more relevant than the U.S. one. Singapore retired the SGD $10,000 as of 2014 but even in 2022, it remains as legal tender. They also retired the SGD $1,000 in 2020 but still remains legal tender.

So let s have a fictitious example to illustrate what is meant by what Singapore has done. Let s say I go to Singapore, rent a flat, and find a $1000 note in that house somewhere. Both practically and theoretically, I could go down to any of the banks, get the amount transferred to my wallet, bank account etc. and nobody will question. Because they have promised the same. Interestingly, the Singapore Dollar has been pretty resilient against the USD for quite a number of years vis-a-vis other Asian currencies. Most of the INR 2k/- notes were also found and exchanged in Gujarat in just a few days (The PM and HM s state.). I am sure you are looking into the mental gymnastics that the RW indulge in :(. What is sadder that most of the people who try to defend can t make sense one way or the other and start to name-call and get personal as they have nothing else

Disability questions dropped in NHFS-6 Just came to know today that in the upcoming National Family Health Survey-6 disability questions are being dropped. Why is this important. To put it simply, if you don t have numbers, you won t and can t make policies for them. India is one of the worst countries to live if you are disabled. The easiest way to share to draw attention is most Railway platforms are not at level with people. Just as Mick Lynch shares in the UK, the same is pretty much true for India too. Meanwhile in Europe, they do make an effort to be level so even disabled people have some dignity. If your public transport is sorted, then people would want much more and you will be obligated to provide for them as they are citizens. Here, we have had many reports of women being sexually molested when being transferred from platform to coach irrespective of their age or whatnot The main takeaway is if you do not have their voice, you won t make policies for them. They won t go away but you will make life hell for them. One thing to keep in mind that most people assume that most people are disabled from birth. This may or may not be true. For e.g. in the above triple Railways accidents, there are bound to be disabled people or newly disabled people who were healthy before the accident. The most common accident is road accidents, some involving pedestrians and vehicles or both, the easiest is Ministry of Road Transport data that says 4,00,000 people sustained injuries in 2021 alone in road mishaps. And this is in a country where even accidents are highly under-reported, for more than one reason. The biggest reason especially in 2 and 4 wheeler is the increased premium they would have to pay if in an accident, so they usually compromise with the other and pay off the Traffic Inspector. Sadly, I haven t read a new book, although there are a few books I m looking forward to have. People living in India and neighbors please be careful as more heat waves are expected. Till later.

I ve recently been working on porting some of my Python code to rust, both for performance reasons, and because of the strong typing in the language. As a fan of Haskell, I also just really enjoy using the language. Porting any large project to a new language can be a challenge. There is a temptation to do a rewrite from the ground-up in idiomatic rust and using all new fancy features of the language.

Porting in one go However, this is a bit of a trap:

It blocks other work. It can take a long time to finish the rewrite, during which time there is no good place to make other bug fixes/feature changes. If you make the change in the python branch, then you may also have to patch the in-progress rust fork.
No immediate return on investment. While the rewrite is happening, all of the investment in it is sunk costs.
Throughout the process, you can only run the tests for subsystems that have already been ported. It s common to find subtle bugs later in code ported early.
Understanding existing code, porting it and making it idiomatic rust all at the same time takes more time and post-facto debugging.

Iterative porting Instead, we ve found that it works much better to take an iterative approach. One of the hidden gems of rust is the excellent PyO3 crate, which allows creating python bindings for rust code in a way that is several times less verbose and less painful than C or SWIG. Because of rust s strong ownership model, it s also really hard to muck up e.g. reference counts when creating Python bindings for rust code. We port individual functions or classes to rust one at a time, starting with functionality that doesn t have dependencies on other python code and gradually working our way up the call stack. Each subsystem of the code is converted to two matching rust crates: one with a port of the code to pure rust, and one with python bindings for the rust code. Generally multiple python modules end up being a single pair of rust crates. The signature for the pure Rust code follow rust conventions, but the business logic is mostly ported as-is (just in rust syntax) and the signatures of the python bindings match that of the original python code. This then allows running the original python tests to verify that the code still behaves the same way. Changes can also immediately land on the main branch. A subsequent step is usually to refactor the rust code to be more idiomatic - all the while keeping the tests passing. There is also the potential to e.g. switch to using external rust crates (with perhaps subtly different behaviour), or drop functionality altogether. At some point, we will also port the tests from python to rust, and potentially drop the python bindings - once all the caller s have been converted to rust.

Example For example, imagine I have a Python module janitor/mail_filter.py with this function:

def parse_plain_text_body(text):
   lines = text.splitlines()
   for i, line in enumerate(lines):
       if line == 'Reply to this email directly or view it on GitHub:':
           return lines[i + 1].split('#')[0]
       if (line == 'For more details, see:'
               and lines[i + 1].startswith('https://code.launchpad.net/')):
           return lines[i + 1]
       try:
           (field, value) = line.split(':', 1)
       except ValueError:
           continue
       if field.lower() == 'merge request url':
           return value.strip()
   return None

Porting this to rust naively (in a crate I ve called mailfilter ) it might look something like this:

pub fn parse_plain_text_body(text: &str) -> Option<String>  
     let lines: Vec<&str> = text.lines().collect();
     for (i, line) in lines.iter().enumerate()  
         if line == &"Reply to this email directly or view it on GitHub:"  
             return Some(lines[i + 1].split('#').next().unwrap().to_string());
          
         if line == &"For more details, see:"
             && lines[i + 1].starts_with("https://code.launchpad.net/")
          
             return Some(lines[i + 1].to_string());
          
         if let Some((field, value)) = line.split_once(':')  
             if field.to_lowercase() == "merge request url"  
                 return Some(value.trim().to_string());
              
          
      
     None
  

Bindings are created in a crate called mailfilter-py, which looks like this:

use pyo3::prelude::*;
 #[pyfunction]
 fn parse_plain_text_body(text: &str) -> Option<String>  
     janitor_mail_filter::parse_plain_text_body(text)
  
 #[pymodule]
 pub fn _mail_filter(py: Python, m: &PyModule) -> PyResult<()>  
     m.add_function(wrap_pyfunction!(parse_plain_text_body, m)?)?;
     Ok(())
  

The metadata for the crates is what you d expect. mailfilter-py uses PyO3 and depends on mailfilter.

[package]
 name = "mailfilter-py"
 version = "0.0.0"
 authors = ["Jelmer Vernoo  <jelmer@jelmer.uk>"]
 edition = "2018"
 [lib]
 crate-type = ["cdylib"]
 [dependencies]
 janitor-mail-filter =   path = "../mailfilter"  
 pyo3 =   version = ">=0.14", features = ["extension-module"] 

I use python-setuptools-rust to get the python ecosystem to build the python bindings. Here is what setup.py looks like:

#!/usr/bin/python3
from setuptools import setup
from setuptools_rust import RustExtension, Binding
setup(
        rust_extensions=[RustExtension(
        "janitor._mailfilter", "crates/mailfilter-py/Cargo.toml",
        binding=Binding.PyO3)],
)

And of course, setuptools-rust needs to be listed as a setup requirement in pyproject.toml or setup.cfg. After that, we can replace the original python code with a simple import and verify that the tests still run:

from ._mailfilter import parse_plain_text_body

Of course, not all bindings are as simple as this. Iterators in particular are more complicated, as is code that has a loose idea of ownership in python. But I ve found that the time investment is usually well worth the ability to land changes on the development head early and often. I d be curious to hear if people have had success with other approaches to porting Python code to Rust. If you do, please leave a comment.

I m calling time on DNSSEC. Last week, prompted by a change in my DNS hosting setup, I began removing it from the few personal zones I had signed. Then this Monday the .nz ccTLD experienced a multi-day availability incident triggered by the annual DNSSEC key rotation process. This incident broke several of my unsigned zones, which led me to say very unkind things about DNSSEC on Mastodon and now I feel compelled to more completely explain my thinking: For almost all domains and use-cases, the costs and risks of deploying DNSSEC outweigh the benefits it provides. Don t bother signing your zones. The .nz incident, while topical, is not the motivation or the trigger for this conclusion. Had it been a novel incident, it would still have been annoying, but novel incidents are how we learn so I have a small tolerance for them. The problem with DNSSEC is precisely that this incident was not novel, just the latest in a long and growing list. It s a clear pattern. DNSSEC is complex and risky to deploy. Choosing to sign your zone will almost inevitably mean that you will experience lower availability for your domain over time than if you leave it unsigned. Even if you have a team of DNS experts maintaining your zone and DNS infrastructure, the risk of routine operational tasks triggering a loss of availability (unrelated to any attempted attacks that DNSSEC may thwart) is very high - almost guaranteed to occur. Worse, because of the nature of DNS and DNSSEC these incidents will tend to be prolonged and out of your control to remediate in a timely fashion. The only benefit you get in return for accepting this almost certain reduction in availability is trust in the integrity of the DNS data a subset of your users (those who validate DNSSEC) receive. Trusted DNS data that is then used to communicate across an untrusted network layer. An untrusted network layer which you are almost certainly protecting with TLS which provides a more comprehensive and trustworthy set of security guarantees than DNSSEC is capable of, and provides those guarantees to all your users regardless of whether they are validating DNSSEC or not. In summary, in our modern world where TLS is ubiquitous, DNSSEC provides only a thin layer of redundant protection on top of the comprehensive guarantees provided by TLS, but adds significant operational complexity, cost and a high likelihood of lowered availability. In an ideal world, where the deployment cost of DNSSEC and the risk of DNSSEC-induced outages were both low, it would absolutely be desirable to have that redundancy in our layers of protection. In the real world, given the DNSSEC protocol we have today, the choice to avoid its complexity and rely on TLS alone is not at all painful or risky to make as the operator of an online service. In fact, it s the prudent choice that will result in better overall security outcomes for your users. Ignore DNSSEC and invest the time and resources you would have spent deploying it improving your TLS key and certificate management. Ironically, the one use-case where I think a valid counter-argument for this position can be made is TLDs (including ccTLDs such as .nz). Despite its many failings, DNSSEC is an Internet Standard, and as infrastructure providers, TLDs have an obligation to enable its use. Unfortunately this means that everyone has to bear the costs, complexities and availability risks that DNSSEC burdens these operators with. We can t avoid that fact, but we can avoid creating further costs, complexities and risks by choosing not to deploy DNSSEC on the rest of our non-TLD zones.

But DNSSEC will save us from the evil CA ecosystem! Historically, the strongest motivation for DNSSEC has not been the direct security benefits themselves (which as explained above are minimal compared to what TLS provides), but in the new capabilities and use-cases that could be enabled if DNS were able to provide integrity and trusted data to applications. Specifically, the promise of DNS-based Authentication of Named Entities (DANE) is that with DNSSEC we can be free of the X.509 certificate authority ecosystem and along with it the expensive certificate issuance racket and dubious trust properties that have long been its most distinguishing features. Ten years ago this was an extremely compelling proposition with significant potential to improve the Internet. That potential has gone unfulfilled. Instead of maturing as deployments progressed and associated operational experience was gained, DNSSEC has been beset by the discovery of issue after issue. Each of these has necessitated further changes and additions to the protocol, increasing complexity and deployment cost. For many zones, including significant zones like google.com (where I led the attempt to evaluate and deploy DNSSEC in the mid 2010s), it is simply infeasible to deploy the protocol at all, let alone in a reliable and dependable manner. While DNSSEC maturation and deployment has been languishing, the TLS ecosystem has been steadily and impressively improving. Thanks to the efforts of many individuals and companies, although still founded on the use of a set of root certificate authorities, the TLS and CA ecosystem today features transparency, validation and multi-party accountability that comprehensively build trust in the ability to depend and rely upon the security guarantees that TLS provides. When you use TLS today, you benefit from:

Free/cheap issuance from a number of different certificate authorities.

Regular, automated issuance/renewal via the ACME protocol.

Visibility into who has issued certificates for your domain and when through Certificate Transparency logs.

Confidence that certificates issued without certificate transparency (and therefore lacking an SCT) will not be accepted by the leading modern browsers.

The use of modern cryptographic protocols as a baseline, with a plausible and compelling story for how these can be steadily and promptly updated over time.

DNSSEC with DANE can match the TLS ecosystem on the first benefit (up front price) and perhaps makes the second benefit moot, but has no ability to match any of the other transparency and accountability measures that today s TLS ecosystem offers. If your ZSK is stolen, or a parent zone is compromised or coerced, validly signed TLSA records for a forged certificate can be produced and spoofed to users under attack with minimal chances of detection. Finally, in terms of overall trust in the roots of the system, the CA/Browser forum requirements continue to improve the accountability and transparency of TLS certificate authorities, significantly reducing the ability for any single actor (say a nefarious government) to subvert the system. The DNS root has a well established transparent multi-party system for establishing trust in the DNSSEC root itself, but at the TLD level, almost intentionally thanks to the hierarchical nature of DNS, DNSSEC has multiple single points of control (or coercion) which exist outside of any formal system of transparency or accountability. We ve moved from DANE being a potential improvement in security over TLS when it was first proposed, to being a definite regression from what TLS provides today. That s not to say that TLS is perfect, but given where we re at, we ll get a better security return from further investment and improvements in the TLS ecosystem than we will from trying to fix DNSSEC.

But TLS is not ubiquitous for non-HTTP applications The arguments above are most compelling when applied to the web-based HTTP-oriented ecosystem which has driven most of the TLS improvements we ve seen to date. Non-HTTP protocols are lagging in adoption of many of the improvements and best practices TLS has on the web. Some claim this need to provide a solution for non-HTTP, non-web applications provides a motivation to continue pushing DNSSEC deployment. I disagree, I think it provides a motivation to instead double-down on moving those applications to TLS. TLS as the new TCP. The problem is that costs of deploying and operating DNSSEC are largely fixed regardless of how many protocols you are intending to protect with it, and worse, the negative side-effects of DNSSEC deployment can and will easily spill over to affect zones and protocols that don t want or need DNSSEC s protection. To justify continued DNSSEC deployment and operation in this context means using a smaller set of benefits (just for the non-HTTP applications) to justify the already high costs of deploying DNSSEC itself, plus the cost of the risk that DNSSEC poses to the reliability to your websites. I don t see how that equation can ever balance, particularly when you evaluate it against the much lower costs of just turning on TLS for the rest of your non-HTTP protocols instead of deploying DNSSEC. MTA-STS is a worked example of how this can be achieved. If you re still not convinced, consider that even DNS itself is considering moving to TLS (via DoT and DoH) in order to add the confidentiality/privacy attributes the protocol currently lacks. I m not a huge fan of the latency implications of these approaches, but the ongoing discussion shows that clever solutions and mitigations for that may exist. DoT/DoH solve distinct problems from DNSSEC and in principle should be used in combination with it, but in a world where DNS itself is relying on TLS and therefore has eliminated the majority of spoofing and cache poisoning attacks through DoT/DoH deployment the benefit side of the DNSSEC equation gets smaller and smaller still while the costs remain the same.

OK, but better software or more careful operations can reduce DNSSEC s cost Some see the current DNSSEC costs simply as teething problems that will reduce as the software and tooling matures to provide more automation of the risky processes and operational teams learn from their mistakes or opt to simply transfer the risk by outsourcing the management and complexity to larger providers to take care of. I don t find these arguments compelling. We ve already had 15+ years to develop improved software for DNSSEC without success. What s changed that we should expect a better outcome this year or next? Nothing. Even if we did have better software or outsourced operations, the approach is still only hiding the costs behind automation or transferring the risk to another organisation. That may appear to work in the short-term, but eventually when the time comes to upgrade the software, migrate between providers or change registrars the debt will come due and incidents will occur. The problem is the complexity of the protocol itself. No amount of software improvement or outsourcing addresses that. After 15+ years of trying, I think it s worth considering that combining cryptography, caching and distributed consensus, some of the most fundamental and complex computer science problems, into a slow-moving and hard to evolve low-level infrastructure protocol while appropriately balancing security, performance and reliability appears to be beyond our collective ability. That doesn t have to be the end of the world, the improvements achieved in the TLS ecosystem over the same time frame provide a positive counter example - perhaps DNSSEC is simply focusing our attention at the wrong layer of the stack. Ideally secure DNS data would be something we could have, but if the complexity of DNSSEC is the price we have to pay to achieve it, I m out. I would rather opt to remain with the simpler yet insecure DNS protocol and compensate for its short comings at higher transport or application layers where experience shows we are able to more rapidly improve and develop our security capabilities.

Summing up For the vast majority of domains and use-cases there is simply no net benefit to deploying DNSSEC in 2023. I d even go so far as to say that if you ve already signed your zones, you should (carefully) move them back to being unsigned - you ll reduce the complexity of your operating environment and lower your risk of availability loss triggered by DNS. Your users will thank you. The threats that DNSSEC defends against are already amply defended by the now mature and still improving TLS ecosystem at the application layer, and investing in further improvements here carries far more return than deployment of DNSSEC. For TLDs, like .nz whose outage triggered this post, DNSSEC is not going anywhere and investment in mitigating its complexities and risks is an unfortunate burden that must be shouldered. While the full incident report of what went wrong with .nz is not yet available, the interim report already hints at some useful insights. It is important that InternetNZ publishes a full and comprehensive review so that the full set of learnings and improvements this incident can provide can be fully realised by .nz and other TLD operators stuck with the unenviable task of trying to safely operate DNSSEC.

Postscript After taking a few days to draft and edit this post, I ve just stumbled across a presentation from the well respected Geoff Huston at last weeks RIPE86 meeting. I ve only had time to skim the slides (video here) - they don t seem to disagree with my thinking regarding the futility of the current state of DNSSEC, but also contain some interesting ideas for what it might take for DNSSEC to become a compelling proposition. Probably worth a read/watch!

During the weekend of 19-23 May 2023 I attended the Wikimedia hackathon 2023 in Athens, Greece. The event physically reunited folks interested in the more technological aspects of the Wikimedia movement in person for the first time since 2019. The scope of the hacking projects include (but was not limited to) tools, wikipedia bots, gadgets, server and network infrastructure, data and other technical systems. My role in the event was two-fold: on one hand I was in the event because of my role as SRE in the Wikimedia Cloud Services team, where we provided very valuable services to the community, and I was expected to support the technical contributors of the movement that were around. Additionally, and because of that same role, I did some hacking myself too, which was specially augmented given I generally collaborate on a daily basis with some community members that were present in the hacking room. The hackathon had some conference-style track and I ran a session with my coworker Bryan, called Past, Present and Future of Wikimedia Cloud Services (Toolforge and friends) (slides) which was very satisfying to deliver given the friendly space that it was. I attended a bunch of other sessions, and all of them were interesting and well presented. The number of ML themes that were present in the program schedule was exciting. I definitely learned a lot from attending those sessions, from how LLMs work, some fascinating applications for them in the wikimedia space, to what were some industry trends for training and hosting ML models. Session

Despite the sessions, the main purpose of the hackathon was, well, hacking. While I was in the hacking space for more than 12 hours each day, my ability to get things done was greatly reduced by the constant conversations, help requests, and other social interactions with the folks. Don t get me wrong, I embraced that reality with joy, because the social bonding aspect of it is perhaps the main reason why we gathered in person instead of virtually. That being said, this is a rough list of what I did:

Helped review the status of Migrate bldrwnsch from Toolforge GridEngine to Toolforge Kubernetes
Helped the maintainer of the bodh Toolforge tool get it working with a reverse proxy and started conversations about how to facilitate the use case.
Discussed persistent storage options in Toolforge, which resulted in some conversations within the WMCS team.
Had several debates with several folks on what computing abstractions we are providing, including Toolforge as a PaaS, raw Kubernetes access, or even if we should continue offering virtual machines to the community.
Patched toolforge-weld to support some stuff required by jobs-framework-cli. Also made a release of it.
Started a patch for jobs-framework-cli to use toolforge-weld. Which is still unfinished as of this writing.
Debated about how to host ML models, what to do with GPUs etc.
Suggested fellow SRE Riccardo in working on cloud-private subnet: introduce new domain to integrate with Netbox.
Uncovered a flavor definition problem in our Openstack deployment.
Reviewed many Toolforge account requests (most of them not related to the hackathon though), some quota requests and similar things.

The hackathon was also the final days of Technical Engagement as an umbrella group for WMCS and Developer Advocacy teams within the Technology department of the Wikimedia Foundation because of an internal reorg.. We used the chance to reflect on the pleasant time we have had together since 2019 and take a final picture of the few of us that were in person in the event. Technical Engagement

It wasn t the first Wikimedia Hackathon for me, and I felt the same as in previous iterations: it was a welcoming space, and I was surrounded by friends and nice human beings. I ended the event with a profound feeling of being privileged, because I was part of the Wikimedia movement, and because I was invited to participate in it.

I've been watching the neovim community for a while and what seems like a cambrian explosion of plugins emerging. A few weeks back I decided to spend most of a "day of learning" on investigating some of the plugins and technologies that I'd read about: Language Server Protocol, TreeSitter, neorg (a grandiose organiser plugin), etc. It didn't go so well. I spent most of my time fighting version incompatibilities or tracing through scant documentation or code to figure out what plugin was incompatible with which other. There's definitely a line where crossing it is spending too much time playing with your tools instead of creating. On the other hand, there's definitely value in honing your tools and learning about new technologies. Everyone's line is probably in a different place. I've come to the conclusion that I don't have the time or inclination (or both) to approach exploring the neovim universe in this way. There exist a number of plugin "distributions" (such as LunarVim): collections of pre- configured and integrated plugins that you can try to use out-of-the-box. Next time I think I'll pick one up and give that a try &emdash independently from my existing configuration &emdash and see which ideas from it I might like to adopt. shared vimrcs Some folks upload their vim or neovim configurations in their entirety for others to see. I noticed Jess Frazelle had published hers so I took a look. I suppose one could evaluate a bunch of plugins and configuration in isolation using a shared vimrc like this, in the same was as a distribution. bufferline Amongst the plugins she uses was bufferline, a plugin to re-work neovim's tab bar to behave like tab bars from more conventional editors¹. I don't make use of neovim's tabs at all², so I would lose nothing having the (presently hidden) tab bar reworked, so I thought I'd give it a go. I had to disable an existing plugin lightline, which I've had enabled for years but I wasn't sure I was getting much value from. Apparently it also messes with the tab bar! Disabling it, at least for now, at least means I'll find out if I miss it. I am already using vim-buffergator as a means of seeing and managing open buffers: a hotkey opens a sidebar with a list of open buffers, to switch between or close. Bufferline gives me a more immediate, always-present view of open buffers, which is faintly useful: but not much. Perhaps I'd like it more if I was coming from an editor that had made it more of an expected feature. The two things I noticed about it that aren't especially useful for me: when browsing around vimwiki pages, I quickly open a lot of buffers. The horizontal line fills up very quickly. Even when I don't, I habitually have quite a lot of buffers open, and the horizontal line is quickly overwhelmed. I have found myself closing open buffers with the mouse, which I didn't do before. vert Since I have brought up a neovim UI feature (tabs) I thought I'd briefly mention my new favourite neovim built-in command: vert. Quite a few plugins and commands open up a new window (e.g. git-fugitive, Man, etc.) and they typically do so in a horizontal split. I'm increasingly preferring vertical splits. Prefixing any³ such command with vert forces the split to be vertical instead.

in this case the direct influence was apparently DOOM Emacs
(neo)vim's notion of tabs is completely different to what you might expect from other UI models.
at least, I haven't found one that doesn't work yet

Review: Tsalmoth, by Steven Brust

Series:	Vlad Taltos #16
Publisher:	Tor
Copyright:	2023
ISBN:	1-4668-8970-5
Format:	Kindle
Pages:	277

Tsalmoth is the sixteenth book in the Vlad Taltos series and (some fans of the series groan) yet another flashback novel to earlier in Vlad's life. It takes place between Yendi and the interludes in Dragon (or, perhaps more straightforwardly, between Yendi and Jhereg. Most of the books of this series stand alone to some extent, so you could read this book out of order and probably not be horribly confused, but I suspect it would also feel weirdly pointless outside of the context of the larger series. We're back to Vlad running a fairly small operation as a Jhereg, who are the Dragaeran version of organized crime. A Tsalmoth who owes Vlad eight hundred imperials has rudely gotten himself murdered, thoroughly enough that he can't be revived. That's a considerable amount of money, and Vlad would like it back, so he starts poking around. As you might expect if you've read any other book in this series, things then get a bit complicated. This time, they involve Jhereg politics, Tsalmoth house politics, and necromancy (which in this universe is more about dimensional travel than it is about resurrecting the dead). The main story is... fine. Kragar is around being unnoticeable as always, Vlad is being cocky and stubborn and bantering with everyone, and what appears to be a straightforward illegal business relationship turns out to involve Dragaeran magic and thus Vlad's highly-placed friends. As usual, they're intellectually curious about the magic and largely ambivalent to the rest of Vlad's endeavors. The most enjoyable part of the story is Vlad's insistence on getting his money back while everyone else in the story cannot believe he would be this persistent over eight hundred imperials and is certain he has some other motive. It's otherwise a fairly forgettable little adventure. The implications for the broader series, though, are significant, although essentially none of the payoff is here. Brust has been keeping a major secret about Vlad that's finally revealed here, one that has little impact on the plot of this book (although it causes Vlad a lot of angst) but which I suspect will become very important later in the series. That was intriguing but rather unsatisfying, since it stays only a future hook with an attached justification for why we're only finding out about it now. If one has read the rest of the series, it's also nice to see Vlad and Cawti working together, bantering with each other and playing off of each other's strengths. It's reminiscent of the best parts of Yendi. As with many of the books of this series, the chapter introductions tell a parallel story; this time, it's Vlad and Cawti's wedding. I think previous books already mentioned that Vlad is narrating this series into some sort of recording device, and a bit about why he's doing that, but this is made quite explicit here. We get as much of the surrounding frame as we've ever seen before. There are no obvious plot consequences from this it's still all hints and guesswork but I suspect this will also become important by the end of the series. If you've read this much of the series, you'll obviously want to read this one as well, but unfortunately don't get your hopes up for significant plot advancement. This is another station-keeping book, which is a bit of a disappointment. We haven't gotten major plot advancement since Hawk in 2014, and I'm getting impatient. Thankfully, Lyorn has a release date already (April 9, 2024), and assuming all goes according to the grand plan, there are only two books left after Lyorn (Chreotha and The Last Contract). I'm getting hopeful that we're going to get to see the entire series. Meanwhile, I am very tempted to do a complete re-read of the series to date, probably in series chronological order rather than in publication order (as much as that's possible given the fractured timelines of Dragon and Tiassa) so that I can see how the pieces fit together. The constant jumping back and forth and allusions to events that have already happened but that we haven't seen yet is hard to keep track of. I'm very glad the Lyorn Records exists. Followed by Lyorn. Rating: 7 out of 10

These instructions are for qotom devices Q515P and Q1075GE. You can order one from Amazon or directly from Cherry Ni <export03@qotom.com>. Instructions are for those coming from Windows. Prerequisites:

A USB keyboard and mouse
A powered HDMI monitor and an HDMI cable
A copy of the Proxmox VE Installer ISO
A USB disk from which to boot the installer
Software and instructions to burn the raw image to USB
The details of your wireless network including wireless network ID (SSID), WPA password, gateway address and network prefix length

To find your windows network details, run the following command at the command prompt:

netsh interface ip show addresses

Here s my output:

PS C:\Users\cjcol> netsh interface ip show addresses "Wi-Fi"
Configuration for interface "Wi-Fi"
    DHCP enabled:                         Yes
    IP Address:                           172.16.79.53
    Subnet Prefix:                        172.16.79.0/24 (mask 255.255.255.0)
    Default Gateway:                      172.16.79.1
    Gateway Metric:                       0
    InterfaceMetric:                      50

Did you follow the instructions linked above in the prerequisites section? If not, take a moment to do so now.
Open Rufus and select the proxmox iso which you downloaded.

You may be warned that Rufus will be acting as dd.

Don t forget to select the USB drive that you want to write the image to. In my example, the device is creatively called NO_LABEL .

You may be warned that re-imaging the USB disk will result in the previous data on the USB disk being lost.

Once the process is complete, the application will indicate that it is complete.

You should now have a USB disk with the Proxmox installer image on it. Place the USB disk into one of the blue, USB-3.0, USB-A slots on the Qotom device so that the system can read the installer image from it at full speed. The Proxmox installer requires a keyboard, video and mouse. Please attach these to the device along with inserting the USB disk you just created.

Press the power button on the Qotom device. Press the F11 key repeatedly until you see the AMI BIOS menu. Press F11 a couple more times. You ll be presented with a boot menu. One of the options will launch the Proxmox installer. By trial and error, I found that the correct boot menu option was UEFI OS

Once you select the correct option, you will be presented with a menu that looks like this. Select the default option and install.

During the install, you will be presented with an option of the block device to install to. I think there s only a single block device in this celeron, but if there are more than one, I prefer the smaller one for the ProxMox OS. I also make a point to limit the size of the root filesystem to 16G. I think it will take up the entire volume group if you don t set a limit. Okay, I ll do another install and select the correct filesystem.

If you read this far and want me to add some more screenshots and better instructions, leave a comment.

This post serves as a report from my attendance to Kubecon and CloudNativeCon 2023 Europe that took place in Amsterdam in April 2023. It was my second time physically attending this conference, the first one was in Austin, Texas (USA) in 2017. I also attended once in a virtual fashion. The content here is mostly generated for the sake of my own recollection and learnings, and is written from the notes I took during the event. The very first session was the opening keynote, which reunited the whole crowd to bootstrap the event and share the excitement about the days ahead. Some astonishing numbers were announced: there were more than 10.000 people attending, and apparently it could confidently be said that it was the largest open source technology conference taking place in Europe in recent times. It was also communicated that the next couple iteration of the event will be run in China in September 2023 and Paris in March 2024. More numbers, the CNCF was hosting about 159 projects, involving 1300 maintainers and about 200.000 contributors. The cloud-native community is ever-increasing, and there seems to be a strong trend in the industry for cloud-native technology adoption and all-things related to PaaS and IaaS. The event program had different tracks, and in each one there was an interesting mix of low-level and higher level talks for a variety of audience. On many occasions I found that reading the talk title alone was not enough to know in advance if a talk was a 101 kind of thing or for experienced engineers. But unlike in previous editions, I didn t have the feeling that the purpose of the conference was to try selling me anything. Obviously, speakers would make sure to mention, or highlight in a subtle way, the involvement of a given company in a given solution or piece of the ecosystem. But it was non-invasive and fair enough for me. On a different note, I found the breakout rooms to be often small. I think there were only a couple of rooms that could accommodate more than 500 people, which is a fairly small allowance for 10k attendees. I realized with frustration that the more interesting talks were immediately fully booked, with people waiting in line some 45 minutes before the session time. Because of this, I missed a few important sessions that I ll hopefully watch online later. Finally, on a more technical side, I ve learned many things, that instead of grouping by session I ll group by topic, given how some subjects were mentioned in several talks. On gitops and CI/CD pipelines Most of the mentions went to FluxCD and ArgoCD. At that point there were no doubts that gitops was a mature approach and both flux and argoCD could do an excellent job. ArgoCD seemed a bit more over-engineered to be a more general purpose CD pipeline, and flux felt a bit more tailored for simpler gitops setups. I discovered that both have nice web user interfaces that I wasn t previously familiar with. However, in two different talks I got the impression that the initial setup of them was simple, but migrating your current workflow to gitops could result in a bumpy ride. This is, the challenge is not deploying flux/argo itself, but moving everything into a state that both humans and flux/argo can understand. I also saw some curious mentions to the config drifts that can happen in some cases, even if the goal of gitops is precisely for that to never happen. Such mentions were usually accompanied by some hints on how to operate the situation by hand. Worth mentioning, I missed any practical information about one of the key pieces to this whole gitops story: building container images. Most of the showcased scenarios were using pre-built container images, so in that sense they were simple. Building and pushing to an image registry is one of the two key points we would need to solve in Toolforge Kubernetes if adopting gitops. In general, even if gitops were already in our radar for Toolforge Kubernetes, I think it climbed a few steps in my priority list after the conference. Another learning was this site: https://opengitops.dev/. Group

On etcd, performance and resource management I attended a talk focused on etcd performance tuning that was very encouraging. They were basically talking about the exact same problems we have had in Toolforge Kubernetes, like api-server and etcd failure modes, and how sensitive etcd is to disk latency, IO pressure and network throughput. Even though Toolforge Kubernetes scale is small compared to other Kubernetes deployments out there, I found it very interesting to see other s approaches to the same set of challenges. I learned how most Kubernetes components and apps can overload the api-server. Because even the api-server talks to itself. Simple things like kubectl may have a completely different impact on the API depending on usage, for example when listing the whole list of objects (very expensive) vs a single object. The conclusion was to try avoiding hitting the api-server with LIST calls, and use ResourceVersion which avoids full-dumps from etcd (which, by the way, is the default when using bare kubectl get calls). I already knew some of this, and for example the jobs-framework-emailer was already making use of this ResourceVersion functionality. There have been a lot of improvements in the performance side of Kubernetes in recent times, or more specifically, in how resources are managed and used by the system. I saw a review of resource management from the perspective of the container runtime and kubelet, and plans to support fancy things like topology-aware scheduling decisions and dynamic resource claims (changing the pod resource claims without re-defining/re-starting the pods). On cluster management, bootstrapping and multi-tenancy I attended a couple of talks that mentioned kubeadm, and one in particular was from the maintainers themselves. This was of interest to me because as of today we use it for Toolforge. They shared all the latest developments and improvements, and the plans and roadmap for the future, with a special mention to something they called kubeadm operator , apparently capable of auto-upgrading the cluster, auto-renewing certificates and such. I also saw a comparison between the different cluster bootstrappers, which to me confirmed that kubeadm was the best, from the point of view of being a well established and well-known workflow, plus having a very active contributor base. The kubeadm developers invited the audience to submit feature requests, so I did. The different talks confirmed that the basic unit for multi-tenancy in kubernetes is the namespace. Any serious multi-tenant usage should leverage this. There were some ongoing conversations, in official sessions and in the hallway, about the right tool to implement K8s-whitin-K8s, and vcluster was mentioned enough times for me to be convinced it was the right candidate. This was despite of my impression that multiclusters / multicloud are regarded as hard topics in the general community. I definitely would like to play with it sometime down the road. On networking I attended a couple of basic sessions that served really well to understand how Kubernetes instrumented the network to achieve its goal. The conference program had sessions to cover topics ranging from network debugging recommendations, CNI implementations, to IPv6 support. Also, one of the keynote sessions had a reference to how kube-proxy is not able to perform NAT for SIP connections, which is interesting because I believe Netfilter Conntrack could do it if properly configured. One of the conclusions on the CNI front was that Calico has a massive community adoption (in Netfilter mode), which is reassuring, especially considering it is the one we use for Toolforge Kubernetes. Slide

On jobs I attended a couple of talks that were related to HPC/grid-like usages of Kubernetes. I was truly impressed by some folks out there who were using Kubernetes Jobs on massive scales, such as to train machine learning models and other fancy AI projects. It is acknowledged in the community that the early implementation of things like Jobs and CronJobs had some limitations that are now gone, or at least greatly improved. Some new functionalities have been added as well. Indexed Jobs, for example, enables each Job to have a number (index) and process a chunk of a larger batch of data based on that index. It would allow for full grid-like features like sequential (or again, indexed) processing, coordination between Job and more graceful Job restarts. My first reaction was: Is that something we would like to enable in Toolforge Jobs Framework? On policy and security A surprisingly good amount of sessions covered interesting topics related to policy and security. It was nice to learn two realities:

kubernetes is capable of doing pretty much anything security-wise and create greatly secured environments.
it does not by default. The defaults are not security-strict on purpose.

It kind of made sense to me: Kubernetes was used for a wide range of use cases, and developers didn t know beforehand to which particular setup they should accommodate the default security levels. One session in particular covered the most basic security features that should be enabled for any Kubernetes system that would get exposed to random end users. In my opinion, the Toolforge Kubernetes setup was already doing a good job in that regard. To my joy, some sessions referred to the Pod Security Admission mechanism, which is one of the key security features we re about to adopt (when migrating away from Pod Security Policy). I also learned a bit more about Secret resources, their current implementation and how to leverage a combo of CSI and RBAC for a more secure usage of external secrets. Finally, one of the major takeaways from the conference was learning about kyverno and kubeaudit. I was previously aware of the OPA Gatekeeper. From the several demos I saw, it was to me that kyverno should help us make Toolforge Kubernetes more sustainable by replacing all of our custom admission controllers with it. I already opened a ticket to track this idea, which I ll be proposing to my team soon. Final notes In general, I believe I learned many things, and perhaps even more importantly I re-learned some stuff I had forgotten because of lack of daily exposure. I m really happy that the cloud native way of thinking was reinforced in me, which I still need because most of my muscle memory to approach systems architecture and engineering is from the old pre-cloud days. List of sessions I attended on the first day:

List of sessions I attended on the second day:

List of sessions I attended on third day:

The videos have been published on Youtube.

PHILIPS PHL 221S8L Those who have been reading this blog for a long time would perhaps know that I had bought a Viewsonic 19 almost 12 years ago. The Monitor was functioning well till last week. I had thought to change it to a 24 monitor almost 3-4 years ago when 24 LCD Monitors were going for around 4k/- or thereabouts. But the monitor kept on functioning and I didn t have space (nor do) to have a dual-monitor setup. It just didn t make sense. Apart from higher electricity charges it would have also have made more demands on my old system which somehow is still functioning even after all those years. Then last week, it started to dim and after couple of days completely conked out. I had wanted to buy a new monitor in front of mum so she could watch movies or whatever but this was not to be. Sp I had to buy an LCD Monitor as Government raised taxes enormously after pandemic. Same/similar monitor that used to cost INR 4k/- today costed me almost INR 7k/- almost double the price. Hooking it to Debian I got the following

`$ sudo hwinfo --monitor grep Model Model: "PHILIPS PHL 221S8L"` FWIW hwinfo is the latest version
`~$ sudo hwinfo --version21.82` I did see couple of movies before starting to write this blog post. Not an exceptional monitor but better than before. I had option from three brands, Dell (most expensive), Philips (middle) and & LG (lowest in prices). Interestingly, Viewsonic disappeared from the market about 5 years back and made a comeback just couple of years ago. Even Philips which had exited the PC Monitor almost a decade back re-entered the market. Apart from the branding, it doesn t make much of a difference as almost all the products including the above monitors are produced in China. I did remember her a lot while buying the monitor as I m sure she would have enjoyed it far more than me but that was not to be

1984 During last week when I didn t have the monitor I re-read 1984. To be completely honest, I had read the above book when I was in the 20 s and I had no context. The protagonist seemed like a whiner and for the life of me I couldn t understand why he didn t try to escape. Re-reading after almost 2 decades and a bit more I shat a number of times because now the context is pretty near and pretty real. I can see why the Republicans in the U.S. banned it. I also realized why the protagnist didn t attempt to run away because wherever he would run away it would be the same thing. It probably is one of the most depressing books I have ever read. To willfully accept what is false after all that torture. What was also interesting to me is to find that George Orwell was also a soldier just like Tolkien was. Both took part and wrote such different stories. While Mr. Tolkien writes and shares the pendulum between hope and despair, Mr. Orwell is decidedly dark. Not grey but dark. I am not sure if I would like to read Animal Farm anytime soon.

The Reaper Man Terry Pratchett It is by sheer coincidence or perhaps I needed something to fill me up when I got The Reaper Man from Terry Pratchett. It was practically like a breath of fresh air. And I love Mr. Pratchett for the inclusivity he brings in. We think about skin color, and what not and here Mr. Pratchett writes about an undead gentleman who s extremely polite as he was a wizard. I won t talk more as I don t really want to spoil the surprise but rest assured everybody is gonna love it. I also read Long Utopia but this is for those who believe and think of multiverses long before it became a buzzword that it is today.

The Firm John Grisham Now I don t know what I should write about this book as there aren t many John Grisham books where a rookie wins against more than one party opposite him. I wouldn t go much into depth but simply say it was worth a read. I am currently reading Gray Mountain. It very much shows how the coal Industry is corrupt and what all it does. It also brings to mind the amount of mining that is done in which Iron is the mostly sought after and done. Now if we are mining 94% Iron then wouldn t it make sense to ask to have a circular economy around Iron but we don t even hear a word about it. Even with all the imagined projections of lithium mining, it would hardly be 10% . I could go on but will finish for now, till later.

Probably everyone is familiar with a regular VPN. The traditional use case is to connect to a corporate or home network from a remote location, and access services as if you were there. But these days, the notion of corporate network and home network are less based around physical location. For instance, a company may have no particular office at all, may have a number of offices plus a number of people working remotely, and so forth. A home network might have, say, a PVR and file server, while highly portable devices such as laptops, tablets, and phones may want to talk to each other regardless of location. For instance, a family member might be traveling with a laptop, another at a coffee shop, and those two devices might want to communicate, in addition to talking to the devices at home. And, in both scenarios, there might be questions about giving limited access to friends. Perhaps you d like to give a friend access to part of your file server, or as a company, you might have contractors working on a limited project. Pretty soon you wind up with a mess of VPNs, forwarded ports, and tricks to make it all work. With the increasing prevalence of CGNAT, a lot of times you can t even open a port to the public Internet. Each application or device probably has its own gateway just to make it visible on the Internet, some of which you pay for. Then you add on the question of: should you really trust your LAN anyhow? With possibilities of guests using it, rogue access points, etc., the answer is probably no . We can move the responsibility for dealing with NAT, fluctuating IPs, encryption, and authentication, from the application layer further down into the network stack. We then arrive at a much simpler picture for all. So this page is fundamentally about making the network work, simply and effectively.

How do we make the Internet work in these scenarios? We re going to combine three concepts:

A VPN, providing fully encrypted and authenticated communication and stable IPs

Mesh Networking, in which devices automatically discover optimal paths to reach each other

Zero-trust networking, in which we do not need to trust anything about the underlying LAN, because all our traffic uses the secure systems in points 1 and 2.

By combining these concepts, we arrive at some nice results:

You can `ssh hostname`, where hostname is one of your machines (server, laptop, whatever), and as long as `hostname` is up, you can reach it, wherever it is, wherever you are.

Combined with mosh, these sessions will be durable even across moving to other host networks.

You could just as well use telnet, because the underlying network should be secure.

You don t have to mess with encryption keys, certs, etc., for every internal-only service. Since IPs are now trustworthy, that s all you need. hosts.allow could make a comeback!

You have a way of transiting out of extremely restrictive networks. Every tool discussed here has a way of falling back on routing things via a broker (relay) on TCP port 443 if all else fails.

There might sometimes be tradeoffs. For instance:

On LANs faster than 1Gbps, performance may degrade due to encryption and encapsulation overhead. However, these tools should let hosts discover the locality of each other and not send traffic over the Internet if the devices are local.

With some of these tools, hosts local to each other (on the same LAN) may be unable to find each other if they can t reach the control plane over the Internet (Internet is down or provider is down)

Some other features that some of the tools provide include:

Easy sharing of limited access with friends/guests

Taking care of everything you need, including SSL certs, for exposing a certain on-net service to the public Internet

Optional routing of your outbound Internet traffic via an exit node on your network. Useful, for instance, if your local network is blocking tons of stuff.

Let s dive in.

Types of Mesh VPNs I ll go over several types of meshes in this article:

Fully decentralized with automatic hop routing This model has no special central control plane. Nodes discover each other in various ways, and establish routes to each other. These routes can be direct connections over the Internet, or via other nodes. This approach offers the greatest resilience. Examples I ll cover include Yggdrasil and tinc.

Automatic peer-to-peer with centralized control In this model, nodes, by default, communicate by establishing direct links between them. A regular node never carries traffic on behalf of other nodes. Special-purpose relays are used to handle cases in which NAT traversal is impossible. This approach tends to offer simple setup. Examples I ll cover include Tailscale, Zerotier, Nebula, and Netmaker.

Roll your own and hybrid approaches This is a grab bag of other ideas; for instance, running Yggdrasil over Tailscale.

Terminology For the sake of consistency, I m going to use common language to discuss things that have different terms in different ecosystems:

Every tool discussed here has a way of dealing with NAT traversal. It may assist with establishing direct connections (eg, STUN), and if that fails, it may simply relay traffic between nodes. I ll call such a relay a broker . This may or may not be the same system that is a control plane for a tool.

All of these systems operate over lower layers that are unencrypted. Those lower layers may be a LAN (wired or wireless, which may or may not have Internet access), or the public Internet (IPv4 and/or IPv6). I m going to call the unencrypted lower layer, whatever it is, the clearnet .

Evaluation Criteria Here are the things I want to see from a solution:

Secure, with all communications end-to-end encrypted and authenticated, and prevention of traffic from untrusted devices.

Flexible, adapting to changes in network topology quickly and automatically.

Resilient, without single points of failure, and with devices local to each other able to communicate even if cut off from the Internet or other parts of the network.

Private, minimizing leakage of information or metadata about me and my systems

Able to traverse CGNAT without having to use a broker whenever possible

A lesser requirement for me, but still a nice to have, is the ability to include others via something like Internet publishing or inviting guests.

Fully or nearly fully Open Source

Free or very cheap for personal use

Wide operating system support, including headless Linux on x86_64 and ARM.

Fully Decentralized VPNs with Automatic Hop Routing Two systems fit this description: Yggdrasil and Tinc. Let s dive in.

Yggdrasil I ll start with Yggdrasil because I ve written so much about it already. It featured in prior posts such as:

Make the Internet Yours Again With an Instant Mesh Network, which described the tyranny of IP rigidity and using Yggdrasil as a global mesh overlay.

Using Yggdrasil As an Automatic Mesh Fabric to Connect All Your Docker Containers, VMs, and Servers is, in a significant sense, a more specific implementation of the ideas contained here; it s a private Yggdrasil mesh providing the communications layer for dispersed Docker containers.

Recovering Our Lost Free Will Online: Tools and Techniques That Are Available Now features Yggdrasil.

Yggdrasil can be a private mesh VPN, or something more Yggdrasil can be a private mesh VPN, just like the other tools covered here. It s unique, however, in that a key goal of the project is to also make it useful as a planet-scale global mesh network. As such, Yggdrasil is a testbed of new ideas in distributed routing designed to scale up to massive sizes and all sorts of connection conditions. As of 2023-04-10, the main global Yggdrasil mesh has over 5000 nodes in it. You can choose whether or not to participate. Every node in a Yggdrasil mesh has a public/private keypair. Each node then has an IPv6 address (in a private address space) derived from its public key. Using these IPv6 addresses, you can communicate right away. Yggdrasil differs from most of the other tools here in that it does not necessarily seek to establish a direct link on the clearnet between, say, host A and host G for them to communicate. It will prefer such a direct link if it exists, but it is perfectly happy if it doesn t. The reason is that every Yggdrasil node is also a router in the Yggdrasil mesh. Let s sit with that concept for a moment. Consider:

If you have a bunch of machines on your LAN, but only one of them can peer over the clearnet, that s fine; all the other machines will discover this route to the world and use it when necessary.

All you need to run a broker is just a regular node with a public IP address. If you are participating in the global mesh, you can use one (or more) of the free public peers for this purpose.

It is not necessary for every node to know about the clearnet IP address of every other node (improving privacy). In fact, it s not even necessary for every node to know about the existence of all the other nodes, so long as it can find a route to a given node when it s asked to.

Yggdrasil can find one or more routes between nodes, and it can use this knowledge of multiple routes to aggressively optimize for varying network conditions, including combinations of, say, downloads and low-latency ssh sessions.

Behind the scenes, Yggdrasil calculates optimal routes between nodes as necessary, using a mesh-wide DHT for initial contact and then deriving more optimal paths. (You can also read more details about the routing algorithm.) One final way that Yggdrasil is different from most of the other tools is that there is no separate control server. No node is special , in charge, the sole keeper of metadata, or anything like that. The entire system is completely distributed and auto-assembling.

Meeting neighbors There are two ways that Yggdrasil knows about peers:

By broadcast discovery on the local LAN

By listening on a specific port (or being told to connect to a specific host/port)

Sometimes this might lead to multiple ways to connect to a node; Yggdrasil prefers the connection auto-discovered by broadcast first, then the lowest-latency of the defined path. In other words, when your laptops are in the same room as each other on your local LAN, your packets will flow directly between them without traversing the Internet.

Unique uses Yggdrasil is uniquely suited to network-challenged situations. As an example, in a post-disaster situation, Internet access may be unavailable or flaky, yet there may be many local devices perhaps ones that had never known of each other before that could share information. Yggdrasil meets this situation perfectly. The combination of broadcast auto-detection, distributed routing, and so forth, basically means that if there is any physical path between two nodes, Yggdrasil will find and enable it. Ad-hoc wifi is rarely used because it is a real pain. Yggdrasil actually makes it useful! Its broadcast discovery doesn t require any IP address provisioned on the interface at all (it just uses the IPv6 link-local address), so you don t need to figure out a DHCP server or some such. And, Yggdrasil will tend to perform routing along the contours of the RF path. So you could have a laptop in the middle of a long distance relaying communications from people farther out, because it could see both. Or even a chain of such things.

Yggdrasil: Security and Privacy Yggdrasil s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure you keep unauthorized traffic out: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. Both can be used, and they can be used simultaneously. I ll discuss firewalling more at the end of this article. Basically, you ll almost certainly want to do this if you participate in the public mesh, because doing so is akin to having a globally-routable public IP address direct to your device. If you want to restrict who can talk to your mesh, you just disable the broadcast feature on all your nodes (empty `MulticastInterfaces` section in the config), and avoid telling any of your nodes to connect to a public peer. You can set a list of authorized public keys that can connect to your nodes listening interfaces, which you ll probably want to do. You will probably want to either open up some inbound ports (if you can) or set up a node with a known clearnet IP on a place like a $5/mo VPS to help with NAT traversal (again, setting `AllowedPublicKeys` as appropriate). Yggdrasil doesn t allow filtering multicast clients by public key, only by network interface, so that s why we disable broadcast discovery. You can easily enough teach Yggdrasil about static internal LAN IPs of your nodes and have things work that way. (Or, set up an internal gateway node or two, that the clients just connect to when they re local). But fundamentally, you need to put a bit more thought into this with Yggdrasil than with the other tools here, which are closed-only. Compared to some of the other tools here, Yggdrasil is better about information leakage; nodes only know details, such as clearnet IPs, of directly-connected peers. You can obtain the list of directly-connected peers of any known node in the mesh but that list is the public keys of the directly-connected peers, not the clearnet IPs. Some of the other tools contain a limited integrated firewall of sorts (with limited ACLs and such). Yggdrasil does not, but is fully compatible with on-host firewalls. I recommend these anyway even with many other tools.

Yggdrasil: Connectivity and NAT traversal Compared to the other tools, Yggdrasil is an interesting mix. It provides a fully functional mesh and facilitates connectivity in situations in which no other tool can. Yet its NAT traversal, while it exists and does work, results in using a broker under some of the more challenging CGNAT situations more often than some of the other tools, which can impede performance. Yggdrasil s underlying protocol is TCP-based. Before you run away screaming that it must be slow and unreliable like OpenVPN over TCP it s not, and it is even surprisingly good around bufferbloat. I ve found its performance to be on par with the other tools here, and it works as well as I d expect even on flaky 4G links. Overall, the NAT traversal story is mixed. On the one hand, you can run a node that listens on port 443 and Yggdrasil can even make it speak TLS (even though that s unnecessary from a security standpoint), so you can likely get out of most restrictive firewalls you will ever encounter. If you join the public mesh, know that plenty of public peers do listen on port 443 (and other well-known ports like 53, plus random high-numbered ones). If you connect your system to multiple public peers, there is a chance though a very small one that some public transit traffic might be routed via it. In practice, public peers hopefully are already peered with each other, preventing this from happening (you can verify this with `yggdrasilctl debug_remotegetpeers key=ABC...`). I have never experienced a problem with this. Also, since latency is a factor in routing for Yggdrasil, it is highly unlikely that random connections we use are going to be competitive with datacenter peers.

Yggdrasil: DNS There is no particular extra DNS in Yggdrasil. You can, of course, run a DNS server within Yggdrasil, just as you can anywhere else. Personally I just add relevant hosts to `/etc/hosts` and leave it at that, but it s up to you.

Yggdrasil: Source code, pricing, and portability Yggdrasil is fully open source (LGPLv3 plus additional permissions in an exception) and highly portable. It is written in Go, and has prebuilt binaries for all major platforms (including a Debian package which I made). There is no charge for anything with Yggdrasil. Listed public peers are free and run by volunteers. You can run your own peers if you like; they can be public and unlisted, public and listed (just submit a PR to get it listed), or private (accepting connections only from certain nodes keys). A peer in this case is just a node with a known clearnet IP address. Yggdrasil encourages use in other projects. For instance, NNCP integrates a Yggdrasil node for easy communication with other NNCP nodes.

Yggdrasil conclusions Yggdrasil is tops in reliability (having no single point of failure) and flexibility. It will maintain opportunistic connections between peers even if the Internet is down. The unique added feature of being able to be part of a global mesh is a nice one. The tradeoffs include being more prone to need to use a broker in restrictive CGNAT environments. Some other tools have clients that override the OS DNS resolver to also provide resolution of hostnames of member nodes; Yggdrasil doesn t, though you can certainly run your own DNS infrastructure over Yggdrasil (or, for that matter, let public DNS servers provide Yggdrasil answers if you wish). There is also a need to pay more attention to firewalling or maintaining separation from the public mesh. However, as I explain below, many other options have potential impacts if the control plane, or your account for it, are compromised, meaning you ought to firewall those, too. Still, it may be a more immediate concern with Yggdrasil. Although Yggdrasil is listed as experimental, I have been using it for over a year and have found it to be rock-solid. They did change how mesh IPs were calculated when moving from 0.3 to 0.4, causing a global renumbering, so just be aware that this is a possibility while it is experimental.

tinc tinc is the oldest tool on this list; version 1.0 came out in 2003! You can think of tinc as something akin to an older Yggdrasil without the public option. I will be discussing tinc 1.0.36, the latest stable version, which came out in 2019. The development branch, 1.1, has been going since 2011 and had its latest release in 2021. The last commit to the Github repo was in June 2022. Tinc is the only tool here to support both tun and tap style interfaces. I go into the difference more in the Zerotier review below. Tinc actually provides a better tap implementation than Zerotier, with various sane options for broadcasts, but I still think the call for an Ethernet, as opposed to IP, VPN is small. To configure tinc, you generate a per-host configuration and then distribute it to every tinc node. It contains a host s public key. Therefore, adding a host to the mesh means distributing its key everywhere; de-authorizing it means removing its key everywhere. This makes it rather unwieldy. tinc can do LAN broadcast discovery and mesh routing, but generally speaking you must manually teach it where to connect initially. Somewhat confusingly, the examples all mention listing a public address for a node. This doesn t make sense for a laptop, and I suspect you d just omit it. I think that address is used for something akin to a Yggdrasil peer with a clearnet IP. Unlike all of the other tools described here, tinc has no tool to inspect the running state of the mesh. Some of the properties of tinc made it clear I was unlikely to adopt it, so this review wasn t as thorough as that of Yggdrasil.

tinc: Security and Privacy As mentioned above, every host in the tinc mesh is authenticated based on its public key. However, to be more precise, this key is validated only at the point it connects to its next hop peer. (To be sure, this is also the same as how the list of allowed pubkeys works in Yggdrasil.) Since IPs in tinc are not derived from their key, and any host can assign itself whatever mesh IP it likes, this implies that a compromised host could impersonate another. It is unclear whether packets are end-to-end encrypted when using a tinc node as a router. The fact that they can be routed at the kernel level by the tun interface implies that they may not be.

tinc: Connectivity and NAT traversal I was unable to find much information about NAT traversal in tinc, other than that it does support it. tinc can run over UDP or TCP and auto-detects which to use, preferring UDP.

tinc: Source code, pricing, and portability tinc is fully open source (GPLv2). It is written in C and generally portable. It supports some very old operating systems. Mobile support is iffy. tinc does not seem to be very actively maintained.

tinc conclusions I haven t mentioned performance in my other reviews (see the section at the end of this post). But, it is so poor as to only run about 300Mbps on my 2.5Gbps network. That s 1/3 the speed of Yggdrasil or Tailscale. Combine that with the unwieldiness of adding hosts and some uncertainties in security, and I m not going to be using tinc.

Automatic Peer-to-Peer Mesh VPNs with centralized control These tend to be the options that are frequently discussed. Let s talk about the options.

Tailscale Tailscale is a popular choice in this type of VPN. To use Tailscale, you first sign up on tailscale.com. Then, you install the tailscale client on each machine. On first run, it prints a URL for you to click on to authorize the client to your mesh ( tailnet ). Tailscale assigns a mesh IP to each system. The Tailscale client lets the Tailscale control plane gather IP information about each node, including all detectable public and private clearnet IPs. When you attempt to contact a node via Tailscale, the client will fetch the known contact information from the control plane and attempt to establish a link. If it can contact over the local LAN, it will (it doesn t have broadcast autodetection like Yggdrasil; the information must come from the control plane). Otherwise, it will try various NAT traversal options. If all else fails, it will use a broker to relay traffic; Tailscale calls a broker a DERP relay server. Unlike Yggdrasil, a Tailscale node never relays traffic for another; all connections are either direct P2P or via a broker. Tailscale, like several others, is based around Wireguard; though wireguard-go rather than the in-kernel Wireguard. Tailscale has a number of somewhat unique features in this space:

Funnel, which lets you expose ports on your system to the public Internet via the VPN.

Exit nodes, which automate the process of routing your public Internet traffic over some other node in the network. This is possible with every tool mentioned here, but Tailscale makes switching it on or off a couple of quick commands away.

Node sharing, which lets you share a subset of your network with guests

A fantastic set of documentation, easily the best of the bunch.

Funnel, in particular, is interesting. With a couple of tailscale serve -style commands, you can expose a directory tree (or a development webserver) to the world. Tailscale gives you a public hostname, obtains a cert for it, and proxies inbound traffic to you. This is subject to some unspecified bandwidth limits, and you can only choose from three public ports, so it s not really a production solution but as a quick and easy way to demonstrate something cool to a friend, it s a neat feature.

Tailscale: Security and Privacy With Tailscale, as with the other tools in this category, one of the main threats to consider is the control plane. What are the consequences of a compromise of Tailscale s control plane, or of the credentials you use to access it? Let s begin with the credentials used to access it. Tailscale operates no identity system itself, instead relying on third parties. For individuals, this means Google, Github, or Microsoft accounts; Okta and other SAML and similar identity providers are also supported, but this runs into complexity and expense that most individuals aren t wanting to take on. Unfortunately, all three of those types of accounts often have saved auth tokens in a browser. Personally I would rather have a separate, very secure, login. If a person does compromise your account or the Tailscale servers themselves, they can t directly eavesdrop on your traffic because it is end-to-end encrypted. However, assuming an attacker obtains access to your account, they could:

Tamper with your Tailscale ACLs, permitting new actions

Add new nodes to the network

Forcibly remove nodes from the network

Enable or disable optional features

Of note is that they cannot just commandeer an existing IP. I would say the riskiest possibility here is that could add new nodes to the mesh. Because they could also tamper with your ACLs, they could then proceed to attempt to access all your internal services. They could even turn on service collection and have Tailscale tell them what and where all the services are. Therefore, as with other tools, I recommend a local firewall on each machine with Tailscale. More on that below. Tailscale has a new alpha feature called tailnet lock which helps with this problem. It requires existing nodes in the mesh to sign a request for a new node to join. Although this doesn t address ACL tampering and some of the other things, it does represent a significant help with the most significant concern. However, tailnet lock is in alpha, only available on the Enterprise plan, and has a waitlist, so I have been unable to test it. Any Tailscale node can request the IP addresses belonging to any other Tailscale node. The Tailscale control plane captures, and exposes to you, this information about every node in your network: the OS hostname, IP addresses and port numbers, operating system, creation date, last seen timestamp, and NAT traversal parameters. You can optionally enable service data capture as well, which sends data about open ports on each node to the control plane. Tailscale likes to highlight their key expiry and rotation feature. By default, all keys expire after 180 days, and traffic to and from the expired node will be interrupted until they are renewed (basically, you re-login with your provider and do a renew operation). Unfortunately, the only mention I can see of warning of impeding expiration is in the Windows client, and even there you need to edit a registry key to get the warning more than the default 24 hours in advance. In short, it seems likely to cut off communications when it s most important. You can disable key expiry on a per-node basis in the admin console web interface, and I mostly do, due to not wanting to lose connectivity at an inopportune time.

Tailscale: Connectivity and NAT traversal When thinking about reliability, the primary consideration here is being able to reach the Tailscale control plane. While it is possible in limited circumstances to reach nodes without the Tailscale control plane, it is a fairly brittle setup and notably will not survive a client restart. So if you use Tailscale to reach other nodes on your LAN, that won t work unless your Internet is up and the control plane is reachable. Assuming your Internet is up and Tailscale s infrastructure is up, there is little to be concerned with. Your own comfort level with cloud providers and your Internet should guide you here. Tailscale wrote a fantastic article about NAT traversal and they, predictably, do very well with it. Tailscale prefers UDP but falls back to TCP if needed. Broker (DERP) servers step in as a last resort, and Tailscale clients automatically select the best ones. I m not aware of anything that is more successful with NAT traversal than Tailscale. This maximizes the situations in which a direct P2P connection can be used without a broker. I have found Tailscale to be a bit slow to notice changes in network topography compared to Yggdrasil, and sometimes needs a kick in the form of restarting the client process to re-establish communications after a network change. However, it s possible (maybe even probable) that if I d waited a bit longer, it would have sorted this all out.

Tailscale: DNS Tailscale s DNS is called MagicDNS. It runs as a layer atop your standard DNS taking over `/etc/resolv.conf` on Linux and provides resolution of mesh hostnames and some other features. This is a concept that is pretty slick. It also is a bit flaky on Linux; dueling programs want to write to `/etc/resolv.conf`. I can t really say this is entirely Tailscale s fault; they document the problem and some workarounds. I would love to be able to add custom records to this service; for instance, to override the public IP for a service to use the in-mesh IP. Unfortunately, that s not yet possible. However, MagicDNS can query existing nameservers for certain domains in a split DNS setup.

Tailscale: Source code, pricing, and portability Tailscale is almost fully open source and the client is highly portable. The client is open source (BSD 3-clause) on open source platforms, and closed source on closed source platforms. The DERP servers are open source. The coordination server is closed source, although there is an open source coordination server called Headscale (also BSD 3-clause) made available with Tailscale s blessing and informal support. It supports most, but not all, features in the Tailscale coordination server. Tailscale s pricing (which does not apply when using Headscale) provides a free plan for 1 user with up to 20 devices. A Personal Pro plan expands that to 100 devices for $48 per year - not a bad deal at $4/mo. A Community on Github plan also exists, and then there are more business-oriented plans as well. See the pricing page for details. As a small note, I appreciated Tailscale s install script. It properly added Tailscale s apt key in a way that it can only be used to authenticate the Tailscale repo, rather than as a systemwide authenticator. This is a nice touch and speaks well of their developers.

Tailscale conclusions Tailscale is tops in sharing and has a broad feature set and excellent documentation. Like other solutions with a centralized control plane, device communications can stop working if the control plane is unreachable, and the threat model of the control plane should be carefully considered.

Zerotier Zerotier is a close competitor to Tailscale, and is similar to it in a lot of ways. So rather than duplicate all of the Tailscale information here, I m mainly going to describe how it differs from Tailscale. The primary difference between the two is that Zerotier emulates an Ethernet network via a Linux tap interface, while Tailscale emulates a TCP/IP network via a Linux tun interface. However, Zerotier has a number of things that make it be a somewhat imperfect Ethernet emulator. For one, it has a problem with broadcast amplification; the machine sending the broadcast sends it to all the other nodes that should receive it (up to a set maximum). I wouldn t want to have a lot of programs broadcasting on a slow link. While in theory this could let you run Netware or DECNet across Zerotier, I m not really convinced there s much call for that these days, and Zerotier is clearly IP-focused as it allocates IP addresses and such anyhow. Zerotier provides special support for emulated ARP (IPv4) and NDP (IPv6). While you could theoretically run Zerotier as a bridge, this eliminates the zero trust principle, and Tailscale supports subnet routers, which provide much of the same feature set anyhow. A somewhat obscure feature, but possibly useful, is Zerotier s built-in support for multipath WAN for the public interface. This actually lets you do a somewhat basic kind of channel bonding for WAN.

Zerotier: Security and Privacy The picture here is similar to Tailscale, with the difference that you can create a Zerotier-local account rather than relying on cloud authentication. I was unable to find as much detail about Zerotier as I could about Tailscale - notably I couldn t find anything about how sticky an IP address is. However, the configuration screen lets me delete a node and assign additional arbitrary IPs within a subnet to other nodes, so I think the assumption here is that if your Zerotier account (or the Zerotier control plane) is compromised, an attacker could remove a legit device, add a malicious one, and assign the previous IP of the legit device to the malicious one. I m not sure how to mitigate against that risk, as firewalling specific IPs is ineffective if an attacker can simply take them over. Zerotier also lacks anything akin to Tailnet Lock. For this reason, I didn t proceed much further in my Zerotier evaluation.

Zerotier: Connectivity and NAT traversal Like Tailscale, Zerotier has NAT traversal with STUN. However, it looks like it s more limited than Tailscale s, and in particular is incompatible with double NAT that is often seen these days. Zerotier operates brokers ( root servers ) that can do relaying, including TCP relaying. So you should be able to connect even from hostile networks, but you are less likely to form a P2P connection than with Tailscale.

Zerotier: DNS Unlike Tailscale, Zerotier does not support automatically adding DNS entries for your hosts. Therefore, your options are approximately the same as Yggdrasil, though with the added option of pushing configuration pointing to your own non-Zerotier DNS servers to the client.

Zerotier: Source code, pricing, and portability The client ZeroTier One is available on Github under a custom business source license which prevents you from using it in certain settings. This license would preclude it being included in Debian. Their library, libzt, is available under the same license. The pricing page mentions a community edition for self hosting, but the documentation is sparse and it was difficult to understand what its feature set really is. The free plan lets you have 1 user with up to 25 devices. Paid plans are also available.

Zerotier conclusions Frankly I don t see much reason to use Zerotier. The virtual Ethernet model seems to be a weird hybrid that doesn t bring much value. I m concerned about the implications of a compromise of a user account or the control plane, and it lacks a lot of Tailscale features (MagicDNS and sharing). The only thing it may offer in particular is multipath WAN, but that s esoteric enough and also solvable at other layers that it doesn t seem all that compelling to me. Add to that the strange license and, to me anyhow, I don t see much reason to bother with it.

Netmaker Netmaker is one of the projects that is making noise these days. Netmaker is the only one here that is a wrapper around in-kernel Wireguard, which can make a performance difference when talking to peers on a 1Gbps or faster link. Also, unlike other tools, it has an ingress gateway feature that lets people that don t have the Netmaker client, but do have Wireguard, participate in the VPN. I believe I also saw a reference somewhere to nodes as routers as with Yggdrasil, but I m failing to dig it up now. The project is in a bit of an early state; you can sign up for an upcoming closed beta with a SaaS host, but really you are generally pointed to self-hosting using the code in the github repo. There are community and enterprise editions, but it s not clear how to actually choose. The server has a bunch of components: binary, CoreDNS, database, and web server. It also requires elevated privileges on the host, in addition to a container engine. Contrast that to the single binary that some others provide. It looks like releases are frequent, but sometimes break things, and have a somewhat more laborious upgrade processes than most. I don t want to spend a lot of time managing my mesh. So because of the heavy needs of the server, the upgrades being labor-intensive, it taking over iptables and such on the server, I didn t proceed with a more in-depth evaluation of Netmaker. It has a lot of promise, but for me, it doesn t seem to be in a state that will meet my needs yet.

Nebula Nebula is an interesting mesh project that originated within Slack, seems to still be primarily sponsored by Slack, but is also being developed by Defined Networking (though their product looks early right now). Unlike the other tools in this section, Nebula doesn t have a web interface at all. Defined Networking looks likely to provide something of a SaaS service, but for now, you will need to run a broker ( lighthouse ) yourself; perhaps on a $5/mo VPS. Due to the poor firewall traversal properties, I didn t do a full evaluation of Nebula, but it still has a very interesting design.

Nebula: Security and Privacy Since Nebula lacks a traditional control plane, the root of trust in Nebula is a CA (certificate authority). The documentation gives this example of setting it up:

./nebula-cert sign -name "lighthouse1" -ip "192.168.100.1/24"
./nebula-cert sign -name "laptop" -ip "192.168.100.2/24" -groups "laptop,home,ssh"
./nebula-cert sign -name "server1" -ip "192.168.100.9/24" -groups "servers"
./nebula-cert sign -name "host3" -ip "192.168.100.10/24"

So the cert contains your IP, hostname, and group allocation. Each host in the mesh gets your CA certificate, and the per-host cert and key generated from each of these steps. This leads to a really nice security model. Your CA is the gatekeeper to what is trusted in your mesh. You can even have it airgapped or something to make it exceptionally difficult to breach the perimeter. Nebula contains an integrated firewall. Because the ability to keep out unwanted nodes is so strong, I would say this may be the one mesh VPN you might consider using without bothering with an additional on-host firewall. You can define static mappings from a Nebula mesh IP to a clearnet IP. I haven t found information on this, but theoretically if NAT traversal isn t required, these static mappings may allow Nebula nodes to reach each other even if Internet is down. I don t know if this is truly the case, however.

Nebula: Connectivity and NAT traversal This is a weak point of Nebula. Nebula sends all traffic over a single UDP port; there is no provision for using TCP. This is an issue at certain hotel and other public networks which open only TCP egress ports 80 and 443. I couldn t find a lot of detail on what Nebula s NAT traversal is capable of, but according to a certain Github issue, this has been a sore spot for years and isn t as capable as Tailscale. You can designate nodes in Nebula as brokers (relays). The concept is the same as Yggdrasil, but it s less versatile. You have to manually designate what relay to use. It s unclear to me what happens if different nodes designate different relays. Keep in mind that this always happens over a UDP port.

Nebula: DNS Nebula has experimental DNS support. In contrast with Tailscale, which has an internal DNS server on every node, Nebula only runs a DNS server on a lighthouse. This means that it can t forward requests to a DNS server that s upstream for your laptop s particular current location. Actually, Nebula s DNS server doesn t forward at all. It also doesn t resolve its own name. The Nebula documentation makes reference to using multiple lighthouses, which you may want to do for DNS redundancy or performance, but it s unclear to me if this would make each lighthouse form a complete picture of the network.

Nebula: Source code, pricing, and portability Nebula is fully open source (MIT). It consists of a single Go binary and configuration. It is fairly portable.

Nebula conclusions I am attracted to Nebula s unique security model. I would probably be more seriously considering it if not for the lack of support for TCP and poor general NAT traversal properties. Its datacenter connectivity heritage does show through.

Roll your own and hybrid Here is a grab bag of ideas:

Running Yggdrasil over Tailscale One possibility would be to use Tailscale for its superior NAT traversal, then allow Yggdrasil to run over it. (You will need a firewall to prevent Tailscale from trying to run over Yggdrasil at the same time!) This creates a closed network with all the benefits of Yggdrasil, yet getting the NAT traversal from Tailscale. Drawbacks might be the overhead of the double encryption and double encapsulation. A good Yggdrasil peer may wind up being faster than this anyhow.

Public VPN provider for NAT traversal A public VPN provider such as Mullvad will often offer incoming port forwarding and nodes in many cities. This could be an attractive way to solve a bunch of NAT traversal problems: just use one of those services to get you an incoming port, and run whatever you like over that. Be aware that a number of public VPN clients have a kill switch to prevent any traffic from egressing without using the VPN; see, for instance, Mullvad s. You ll need to disable this if you are running a mesh atop it.

Other

Combining with local firewalls For most of these tools, I recommend using a local firewal in conjunction with them. I have been using firehol and find it to be quite nice. This means you don t have to trust the mesh, the control plane, or whatever. The catch is that you do need your mesh VPN to provide strong association between IP address and node. Most, but not all, do.

Performance I tested some of these for performance using iperf3 on a 2.5Gbps LAN. Here are the results. All speeds are in Mbps.

Tool iperf3 (default) iperf3 -P 10 iperf3 -R

Direct (no VPN) 2406 2406 2764

Wireguard (kernel) 1515 1566 2027

Yggdrasil 892 1126 1105

Tailscale 950 1034 1085

Tinc 296 300 277

You can see that Wireguard was significantly faster than the other options. Tailscale and Yggdrasil were roughly comparable, and Tinc was terrible.

Tool	iperf3 (default)	iperf3 -P 10	iperf3 -R
Direct (no VPN)	2406	2406	2764
Wireguard (kernel)	1515	1566	2027
Yggdrasil	892	1126	1105
Tailscale	950	1034	1085
Tinc	296	300	277

IP collisions When you are communicating over a network such as these, you need to trust that the IP address you are communicating with belongs to the system you think it does. This protects against two malicious actor scenarios:

Someone compromises one machine on your mesh and reconfigures it to impersonate a more important one

Someone connects an unauthorized system to the mesh, taking over a trusted IP, and uses the privileges of the trusted IP to access resources

To summarize the state of play as highlighted in the reviews above:

Yggdrasil derives IPv6 addresses from a public key

tinc allows any node to set any IP

Tailscale IPs aren t user-assignable, but the assignment algorithm is unknown

Zerotier allows any IP to be allocated to any node at the control plane

I don t know what Netmaker does

Nebula IPs are baked into the cert and signed by the CA, but I haven t verified the enforcement algorithm

So this discussion really only applies to Yggdrasil and Tailscale. tinc and Zerotier lack detailed IP security, while Nebula expects IP allocations to be handled outside of the tool and baked into the certs (therefore enforcing rigidity at that level). So the question for Yggdrasil and Tailscale is: how easy is it to commandeer a trusted IP? Yggdrasil has a brief discussion of this. In short, Yggdrasil offers you both a dedicated IP and a rarely-used /64 prefix which you can delegate to other machines on your LAN. Obviously by taking the dedicated IP, a lot more bits are available for the hash of the node s public key, making collisions technically impractical, if not outright impossible. However, if you use the /64 prefix, a collision may be more possible. Yggdrasil s hashing algorithm includes some optimizations to make this more difficult. Yggdrasil includes a `genkeys` tool that uses more CPU cycles to generate keys that are maximally difficult to collide with. Tailscale doesn t document their IP assignment algorithm, but I think it is safe to say that the larger subnet you use, the better. If you try to use a /24 for your mesh, it is certainly conceivable that an attacker could remove your trusted node, then just manually add the 240 or so machines it would take to get that IP reassigned. It might be a good idea to use a purely IPv6 mesh with Tailscale to minimize this problem as well. So, I think the risk is low in the default configurations of both Yggdrasil and Tailscale (certainly lower than with tinc or Zerotier). You can drive the risk even lower with both.

Final thoughts For my own purposes, I suspect I will remain with Yggdrasil in some fashion. Maybe I will just take the small performance hit that using a relay node implies. Or perhaps I will get clever and use an incoming VPN port forward or go over Tailscale. Tailscale was the other option that seemed most interesting. However, living in a region with Internet that goes down more often than I d like, I would like to just be able to send as much traffic over a mesh as possible, trusting that if the LAN is up, the mesh is up. I have one thing that really benefits from performance in excess of Yggdrasil or Tailscale: NFS. That s between two machines that never leave my LAN, so I will probably just set up a direct Wireguard link between them. Heck of a lot easier than trying to do Kerberos! Finally, I wrote this intending to be useful. I dealt with a lot of complexity and under-documentation, so it s possible I got something wrong somewhere. Please let me know if you find any errors.
This blog post is a copy of a page on my website. That page may be periodically updated.

I've been pondering what should happen to personal websites once the owner has passed, or otherwise "moved on". Some time ago I stumbled across a blog by Kev Quirk, who wrote

I d need to come up with contingency plans for...This website, and any other websites and own/manage

I thought it was a strange idea, to have a contingency plan for a personal site to survive its writer. Even a site such as my own, which (as sites go) would be trivial to host/mirror (since it's static), who would want to do that? Why would they do that? On the other hand, whether something I've written is useful or not is largely independent of whether I'm alive and well. And if anything I've written is useful independently from me (to pick a recent example, my notes on imaging optical media), should it persist somewhere? I've long thought of personal sites much like any website, which is to say, implicitly eternal, despite mountains of evidence that practically the reverse is true. I was influenced a long time ago by the article Cool URIs don't change. Thinking more about "digital legacy" led me to start believing that personal sites are ultimately not the right place for any kind of content that should have some persistence. (You might well have started with that assumption!) Yesterday via Planet Debian I read Aur lien Jarno's recent blog post where they have deprecated all of their personal site bar their blog. Quoting Aur lien:

Wikipedia is a much better platform for sharing knowledge than random websites.

This struck me as an interesting idea. Wikipedia is a reasonable place for a lot of material that might otherwise be hosted on personal sites, and as a "living website", content can be adapted, corrected, etc. over time. Wikipedia is clearly not the right place for much other material that might exist on personal sites (it's not the right place for my aforementioned article). I'm not sure where might be. Perhaps we, as a culture, need to move away from the notion of personal sites, and devise a more collective concept. Or perhaps it's no bad thing that, without intervention, this stuff disappears.

Review: The Last Hero, by Terry Pratchett

Illustrator:	Paul Kidby
Series:	Discworld #27
Publisher:	Harper
Copyright:	2001, 2002
ISBN:	0-06-050777-2
Format:	Graphic novel
Pages:	176

The Last Hero is the 27th Discworld novel and part of the Rincewind subseries. This is something of a sequel to Interesting Times and relies heavily on the cast that was built up in previous books. It's not a good place to start with the series. At last, the rare Rincewind novel that I enjoyed. It helps that Rincewind is mostly along for the ride. Cohen the Barbarian and his band of elderly heroes have decided they're tired of enjoying their spoils and are going on a final adventure. They're going to return fire to the gods, in the form of a giant bomb. The wizards in Ankh-Morpork get wind of this and realize that an explosion at the Hub where the gods live could disrupt the magical field of the entire Disc, effectively destroying it. The only hope seems to be to reach Cori Celesti before Cohen and head him off, but Cohen is already almost there. Enter Lord Vetinari, who has Leonard of Quirm design a machine that will get them there in time by slingshotting under the Disc itself. First off, let me say how much I love the idea of returning fire to the gods with interest. I kind of wish Pratchett had done more with their motivations, but I was laughing about that through the whole book. Second, this is the first of the illustrated Discworld books that I've read in the intended illustrated form (I read the paperback version of Eric), and this book is gorgeous. I enjoyed Paul Kidby's art far more than I had expected to. His style what I will call, for lack of better terminology due to my woeful art education, "highly detailed caricature." That's not normally a style that clicks with me, but it works incredibly well for Discworld. The Last Hero is richly illustrated, with some amount of art, if only subtle background behind the text, on nearly every page. There are several two-page spreads, but oddly I thought those (including the parody of The Scream on the cover) were the worst art of the book. None of them did much for me. The best art is in the figure studies and subtle details: Leonard of Quirm's beautiful calligraphy, his numerous sketches, the labeled illustration of the controls of the flying machine, and the portraits of Cohen's band and the people they encounter. The edition I got is printed on lovely, thick glossy paper, and the subtle art texture behind the writing makes this book a delight to read. I'm not sure if, like Eric, this book comes in other editions, but if so, I highly recommend getting or finding the high-quality illustrated edition for the best reading experience. The plot, like a lot of the Rincewind books, doesn't amount to much, but I enjoyed the mission to intercept Cohen. Leonard of Quirm is a great character, and the slow revelation of his flying machine design (which I will not spoil) is a delightful combination of Leonardo da Vinci parody, Discworld craziness, and NASA homage. Also, the Librarian is involved, which always improves a Discworld book. (The Luggage, sadly, is not; I would have liked to have seen a richly-illustrated story about it, but it looks like I'll have to find the illustrated version of Eric for that.) There is one of Pratchett's philosophical subtexts here, about heroes and stories and what it means for your story to live on. To be honest, it didn't grab me; it's mostly subtext, and this particular set of characters weren't quite introspective enough to make the philosophy central to the story. Also, I was perhaps too sympathetic to Cohen's goals, and thus not very interested in anyone successfully stopping him. But I had a lot more fun with this one than I usually do with Rincewind books, helped considerably by the illustrations. If you've been skipping Rincewind books in your Discworld read-through and have access to the illustrated edition of The Last Hero, consider making an exception for this one. Followed by The Amazing Maurice and His Educated Rodents in publication order and, thematically, by Unseen Academicals. Rating: 7 out of 10

Welcome to the March 2023 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. As a quick recap, the motivation behind the reproducible builds effort is to ensure no malicious flaws have been introduced during compilation and distributing processes. It does this by ensuring identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. If you are interested in contributing to the project, please do visit our Contribute page on our website.

News There was progress towards making the Go programming language reproducible this month, with the overall goal remaining making the Go binaries distributed from Google and by Arch Linux (and others) to be bit-for-bit identical. These changes could become part of the upcoming version 1.21 release of Go. An issue in the Go issue tracker (#57120) is being used to follow and record progress on this.
Arnout Engelen updated our website to add and update reproducibility-related links for NixOS to reproducible.nixos.org. [ ]. In addition, Chris Lamb made some cosmetic changes to our presentations and resources page. [ ][ ]
Intel published a guide on how to reproducibly build their Trust Domain Extensions (TDX) firmware. TDX here refers to an Intel technology that combines their existing virtual machine and memory encryption technology with a new kind of virtual machine guest called a Trust Domain. This runs the CPU in a mode that protects the confidentiality of its memory contents and its state from any other software.
A reproducibility-related bug from early 2020 in the GNU GCC compiler as been fixed. The issues was that if GCC was invoked via the `as` frontend, the `-ffile-prefix-map` was being ignored. We were tracking this in Debian via the `build_path_captured_in_assembly_objects` issue. It has now been fixed and will be reflected in GCC version 13.
Holger Levsen will present at foss-north 2023 in April of this year in Gothenburg, Sweden on the topic of Reproducible Builds, the first ten years.
Anthony Andreoli, Anis Lounis, Mourad Debbabi and Aiman Hanna of the Security Research Centre at Concordia University, Montreal published a paper this month entitled On the prevalence of software supply chain attacks: Empirical study and investigative framework:
Software Supply Chain Attacks (SSCAs) typically compromise hosts through trusted but infected software. The intent of this paper is twofold: First, we present an empirical study of the most prominent software supply chain attacks and their characteristics. Second, we propose an investigative framework for identifying, expressing, and evaluating characteristic behaviours of newfound attacks for mitigation and future defense purposes. We hypothesize that these behaviours are statistically malicious, existed in the past, and thus could have been thwarted in modernity through their cementation x-years ago. [ ]

On our mailing list this month:

Mattia Rizzolo is asking everyone in the community to save the date for the 2023 s Reproducible Builds summit which will take place between October 31st and November 2nd at Dock Europe in Hamburg, Germany. Separate announcement(s) to follow. [ ]

ahojlm posted an message announcing a new project which is the first project offering bootstrappable and verifiable builds without any binary seeds. That is to say, a way of providing a verifiable path towards trusted software development platform without relying on pre-provided binary code in order to prevent against various forms of compiler backdoors. The project s homepage is hosted on Tor (mirror).

The minutes and logs from our March 2023 IRC meeting have been published. In case you missed this one, our next IRC meeting will take place on Tuesday 25th April at 15:00 UTC on `#reproducible-builds` on the OFTC network.
and as a Valentines Day present, Holger Levsen wrote on his blog on 14th February to express his thanks to OSUOSL for their continuous support of reproducible-builds.org. [ ]

Debian Vagrant Cascadian developed an easier setup for testing debian packages which uses sbuild s unshare mode along and reprotest, our tool for building the same source code twice in different environments and then checking the binaries produced by each build for any differences. [ ]
Over 30 reviews of Debian packages were added, 14 were updated and 7 were removed this month, all adding to our knowledge about identified issues. A number of issues were updated, including the Holger Levsen updating `build_path_captured_in_assembly_objects` to note that it has been fixed for GCC 13 [ ] and Vagrant Cascadian added new issues to mark packages where the build path is being captured via the Rust toolchain [ ] as well as new categorisation for where virtual packages have nondeterministic versioned dependencies [ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Bernhard M. Wiedemann:

`cockpit` (gzip mtime)

`crmsh` (by mcepl: rewrite to avoid python toolchain issue)

`cx_Freeze` (merged, FTBFS-2038)

`golangci-lint` (date)

`guestfs-tools` (gzip mtime)

`perf` (merged, sort python scandir)

`perl-Date-Calc-XS` (FTBFS-2038)

`perl-Date-Calc` (FTBFS-2038)

`pw3270` (merged, date)

`python-dtaidistance` (drop unreproducible unnecessary file)

`sonic-pi` (FTBFS-2038)

`spack` (parallelism)

`tesseract` (fixed, CPU, -march=native)

Chris Lamb:

#1032409 filed against `esda`.

#1032759 filed against `gle-graphics-manual`.

Stefan Br ns:

`transfig/fig2dev` (also in openSUSE ; date in PDF)

Vagrant Cascadian:

#1033032 filed against `buddy`.

#1033089 filed against `subread`.

#1033663 filed against `linux`.

In addition, Vagrant Cascadian filed a bug with a patch to ensure GNU Modula-2 supports the `SOURCE_DATE_EPOCH` environment variable.

Testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In March, the following changes were made by Holger Levsen:

Arch Linux-related changes:

Build Arch packages in `/tmp/archlinux-ci/$SRCPACKAGE` instead of `/tmp/$SRCPACKAGE`. [ ]

Start 2/3 of the builds on the `o1` node, the rest on `o2`. [ ]

Add graphs for Arch Linux (and OpenWrt) builds. [ ]

Toggle Arch-related builders to debug why a specific node overloaded. [ ][ ][ ][ ]

Node health checks:

Detect `SetuptoolsDeprecationWarning` tracebacks in Python builds. [ ]

Detect failures do perform `chdist` calls. [ ][ ]

OSUOSL node migration.

Install `megacli` packages that are needed for hardware RAID. [ ][ ]

Add health check and maintenance jobs for new nodes. [ ]

Add mail config for new nodes. [ ][ ]

Handle a node running in the future correctly. [ ][ ]

Migrate some nodes to Debian bookworm. [ ]

Fix nodes health overview for osuosl3. [ ]

Make sure the `/srv/workspace` directory is owned by by the `jenkins` user. [ ]

Use `.debian.net` names everywhere, except when communicating with the outside world. [ ]

Grant fpierret access to a new node. [ ]

Update documentation. [ ]

Misc migration changes. [ ][ ][ ][ ][ ][ ][ ][ ]

Misc changes:

Enable fail2ban everywhere and monitor it with munin [ ].

Gracefully deal with non-existing Alpine schroots. [ ]

In addition, Roland Clobus is continuing his work on reproducible Debian ISO images:

Add/update openQA configuration [ ], and use the actual timestamp for openQA builds [ ].

Moved adding the user to the `docker` group from the `janitor_setup_worker` script to the (more general) `update_jdn.sh` script. [ ]

Use the (short-term) reproducible source when generating `live-build` images. [ ]

diffoscope development diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats as well. This month, Mattia Rizzolo released versions `238`, and Chris Lamb released versions `239` and `240`. Chris Lamb also made the following changes:

Fix compatibility with PyPDF 3.x, and correctly restore test data. [ ]

Rework PDF annotation handling into a separate method. [ ]

In addition, Holger Levsen performed a long-overdue overhaul of the Lintian overrides in the Debian packaging [ ][ ][ ][ ], and Mattia Rizzolo updated the packaging to silence an `include_package_data=True` [ ], fixed the build under Debian bullseye [ ], fixed tool name in a list of tools permitted to be absent during package build tests [ ] and as well as documented sending out an email upon [ ]. In addition, Vagrant Cascadian updated the version of GNU Guix to 238 [ and 239 [ ]. Vagrant also updated reprotest to version 0.7.23. [ ]

Other development work Bernhard M. Wiedemann published another monthly report about reproducibility within openSUSE

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

IRC: `#reproducible-builds` on `irc.oftc.net`.

Twitter: @ReproBuilds

Mailing list: `rb-general@lists.reproducible-builds.org`

(TL;DR: I ve been trying out btrfs in some places instead of ext4, I ve hit absolutely zero issues and there are a few features that make me plan to use it more.) Despite (or perhaps because of) working on storage products for a reasonable chunk of my career I have tended towards a conservative approach to my filesystems. By the time I came to Linux ext2 was well established, the move to ext3 was a logical one (the joys of added journalling for faster recovery after unclean shutdowns) and for a long time my default stack has been MD raid with LVM2 on top and then ext4 as the filesystem. I ve dabbled with other filesystems; I ran XFS for a while on my VDR machine, and also when I had a large tradspool with INN, but never really had a hard requirement for it. I ve ended up adminning a machine that had JFS in the past, largely for historical reasons, but don t really remember any issues (vague recollections of NFS problems but that might just have been NFS being NFS). However. ZFS has gathered itself a significant fan base and that makes me wonder about what it can offer and whether I want that. Firstly, let s be clear that I m never going to run a primary filesystem that isn t part of the mainline kernel. So ZFS itself is out, because I run Linux. So what do I want that I can t get with ext4? Firstly, I d like data checksumming. As storage gets larger there s a bigger chance of silent data corruption and while I have backups of the important stuff that doesn t help if you don t know you need to use them. Secondly, these days I have machines running containers, VMs, or with lots of source checkouts with a reasonable amount of overlap in their data. Disk space has got cheaper, but I d still like to be able to do some sort of deduplication of common blocks. So, I ve been trying out btrfs. When I installed my desktop I went with btrfs for / and /home (I kept /boot as ext4). The thought process was that this was a local machine (so easy access if it all went wrong) and I take regular backups (so if it all went wrong I could recover). That was a year and a half ago and it s been pretty dull; I mostly forget I m running btrfs instead of ext4. This is on a machine that tracks Debian testing, so currently on kernel 6.1 but originally installed with 5.10. So it seems modern btrfs is reasonably stable for a machine that isn t driven especially hard. Good start. The fact I forget what filesystem I m running points to the fact that I m not actually doing anything special here. I get the advantage of data checksumming, but not much else. 2 things spring to mind. Firstly, I don t do snapshots. Given I run testing it might be wiser if I did take a snapshot before every apt-get upgrade, and I have a friend who does just that, but even when I ve run unstable I ve never had a machine get itself into a state that I couldn t recover so I haven t spent time investigating. I note Ubuntu has apt-btrfs-snapshot but it doesn t seem to have any updates for years. The other thing I didn t do when I installed my desktop is take advantage of subvolumes. I m still trying to get my head around exactly what I want them for, but they provide a partial replacement for LVM when it comes to carving up disk space. Instead of the separate / and /home LVs I created I could have created a single LV that would have a single btrfs filesystem on it. / and /home would then be separate subvolumes, allowing me to snapshot each individually. Quotas can also be applied separately so there s still the potential to prevent one subvolume taking all available space. Encouraged by the lack of hassle with my desktop I decided to try moving my sbuild machine over to use btrfs for its build chroots. For Reasons this is a VM kindly hosted by a friend, rather than something local. To be honest these days I would probably go for local hosting, but it works and there s no strong reason to move. The point is it s remote, and so if migrating went wrong and I had to ask for assistance I d be bothering someone who s doing me a favour as it is. The build VM is, of course, running LVM, and there was luckily some free space available. I m reasonably sure the underlying storage involves spinning rust, so I did a laborious set of pvmove commands to make sure all the available space was at the start of the PV, and created a new btrfs volume there. I was advised that while btrfs-convert would do the job it was better to create a fresh filesystem where possible. This time I did create an initial root subvolume. Configuring up sbuild was then much simpler than I d expected. My setup originally started out as a set of tarballs for the chroots that would get untarred + used for the builds, which is pretty slow. Once overlayfs was mature enough I switched to that. I d had a conversation with Enrico about his nspawn/btrfs setup, but it turned out Russ Allbery had written an excellent set of instructions on sbuild with btrfs. I tweaked my existing setup based on his details, and I was in business. Each chroot is a separate subvolume - I don t actually end up having to mount them individually, but it means that only the chroot in use gets snapshotted. For example during a build the following can be observed:

# btrfs subvolume list /
ID 257 gen 111534 top level 5 path root
ID 271 gen 111525 top level 257 path srv/chroot/unstable-amd64-sbuild
ID 275 gen 27873 top level 257 path srv/chroot/bullseye-amd64-sbuild
ID 276 gen 27873 top level 257 path srv/chroot/buster-amd64-sbuild
ID 343 gen 111533 top level 257 path srv/chroot/snapshots/unstable-amd64-sbuild-328059a0-e74b-4d9f-be70-24b59ccba121

I was a little confused about whether I d got something wrong because the snapshot top level is listed as 257 rather than 271, but digging further with btrfs subvolume show on the 2 mounted directories correctly showed the snapshot had a parent equal to the chroot, not /. As a final step I ran jdupes via jdupes -1Br / to deduplicate things across the filesystem. It didn t end up providing a significant saving unfortunately - I guess there s a reasonable amount of change between Debian releases - but I think tried it on my desktop, which tends to have a large number of similar source trees checked out. There I managed to save about 5% on /home, which didn t seem too shabby. The sbuild setup has been in place for a couple of months now, and I ve run quite a few builds on it while preparing for the freeze. So I m fairly confident in the stability of the setup and my next move is to transition my local house server over to btrfs for its containers (which all run under systemd-nspawn). Those are generally running a Debian stable base so there should be a decent amount of commonality for deduping. I m not saying I m yet at the point where I ll default to btrfs on new installs, but I m definitely looking at it for situations where I think I can get benefits from deduplication, or being able to divide up disk space without hard partitioning space. (And, just to answer the worry I had when I started, I ve got nowhere near ENOSPC problems, but I believe they re handled much more gracefully these days. And my experience of ZFS when it got above 90% utilization was far from ideal too.)

Three Ancient Civilizations I have been reading books for a long time but somehow I don t know how or when I realized that there are three medieval civilizations that time and again seem to fascinate the authors, either American or European. The three civilizations that do get mentioned every now and then are Egyptian, Greece and Roman and I have no clue as to why. Another two civilizations closely follow them, Mesopotian and Sumerian Civilizations. Why most of the authors irrespective of genre are mystified by these 5 civilizations is beyond me but they also conjure lot of imagination. Now if I was on the right side of 40, maybe 22-25 onwards and had the means or the opportunity or both, I would have gone and tried to learn as much as I can about these various civilizations. There is still a lot of enigma attached and it seems that the official explanations of how these various civilizations ended seems too good to be true or whatever. One of the more interesting points has been how Greece mythology were subverted and flipped to make Roman mythology. Apparently, many Greek gods have a Roman aspect and the qualities are opposite to the Greek aspect. I haven t learned the reason why it is so, yet. Apart from the Iziko Natural Museum in South Africa which did have its share of wonders (the whole South African experience was surreal) nowhere I have witnessed stuff from the above other than in Africa. The whole thing seemed just surreal. I could go on but as I don t know what is real and what is mythology when it comes to various cultures including whatever little I experienced in Africa.

Recent History Having said the above, I also find that many Indians somehow do not either know or are not interested in understanding recent history. I am talking of the time between 1800s and 1950 s. British apologist Mr. William Dalrymple in his book Anarchy has shared how the East Indian Company looted India and gave less than half back to the British. So where did the rest of the money go ? It basically went to the tax free havens around UK. That tale is very much similar as to how the Axis gold was used to have an entity called Switzerland from scratch. There have been books as well as couple of miniseries or two that document that fact. But for me this is not the real story, this is more of a side dish per-se. The real story perhaps is how the EU peace project was born. Because of Hitler s rise and the way World War 2 happened, the Allies knew that they couldn t subjugate Germany through another humiliation as they had after World War 1, who knows another Hitler might come. So while they did fine the Germans, they also helped them via the Marshall plan. Germany also realized that it had been warring with other European countries for over a thousand years as well as quite a few other countries. The first sort of treaty immediately in that regard was the Western Union Alliance . While on the face of it, it was a defense treaty between sovereign powers. Interestingly, as can be seen UK at that point in time wanted more countries to be part of the Treaty of Dunkirk. This was quickly followed 2 years later by the Treaty of London which made way for the Council of Europe to be formed. The reason I am sharing is because a lot of Indians whom I meet on SM do not know that UK was instrumental front and center for the formation of European Union (EU). Also they perhaps don t realize that after World War 2, UK was greatly diminished both financially and militarily. That is the reason it gave back its erstwhile colonies including India and other countries. If UK hadn t become a part of EU they would have been called the poor man of Europe as they had been called few hundred years ago. In fact, after Brexit, UK has been the only nation that has fallen on the dire times as much as it has. They have practically no food, supermarkets running out of veggies and whatnot. electricity sky-high as they don t want to curtail profits of the gas companies which in turn donate money to Tory coffers. Sadly, most of my brethen do not know that hence I have had to share it. The EU became a power in its own right due to Gravity model of trade. The U.S. is attempting the same thing and calls it near off-shoring.

Tesla, Investor Day Presentation, Toyota and Free Software The above brings it nicely about the Tesla Investor Day that happened couple of weeks back. The biggest news though was broken just 24 hours earlier when the governor of Monterrey] shared how Giga New Mexico would be happening and shared some site photographs. There have been bits of news on that off-and-on since then. Toyota meanwhile has been putting lot of anti-EV posters and whatnot. In fact, India seems to be mirroring Japan in a lot of ways, the only difference is Japan is still superior economically than India. I do sincerely hope that at least Japan can get out of its lost decades. I have asked privately if any of the Japanese translators would be willing to translate the anti-EV poster so we may all know.

Anti-EV poster by Toyota

Urinal outside UK Embassy About a week back few people chanting Khalistan entered the Indian Embassy and showed the flags. This was in the UK. Now in a retaliatory move against a friendly nation, we want to make a Urinal. after making a security downgrade for the Embassy. The pettiness being shown by Indian Govt. especially when it has very few friends. There are also plans to do the same for the U.S. and other embassies as well. I do not know when we will get common sense.

Holi Just a few weeks back, Holi happened. It used to be a sweet and innocent festival. But from the last few years, I have been hearing and seeing sexual harassment on the rise. In fact, saw quite a few lewd posts written to women on Holi or verge of Holi and also videos of the same. One of the most shameful incidents occurred with a Japanese tourist. She was not only sexually molested but also terrorized that if she were to report then she could be raped and murdered. She promptly left Delhi. Once it was reported in mass media, Delhi Police tried to show it was doing something. FWIW, Delhi s crime against women stats have been at record high and conviction at record low.

Ecommerce Rules being changed, logistics gonna be tough. Just today there have been changes in Ecommerce rules (again) and this is gonna be a pain for almost all companies big and small with the exception of Adani and Ambani. Almost all players including the Tatas have called them out. Of course, all such laws have been passed without debate. In such a scenario, small startups like these cannot hope to grow their business.

Inertial Measurement Unit or Full Body Tracking with cheap hardware. Apparently, a whole host of companies are looking at 3-D tracking using cheap hardware apparently known as IMU sensors. Sooner than later cheap 3-D glasses and IMU sensors should explode the 3-D market worldwide. There is a huge potential and upside to it and will probably overtake smartphones as well. But then the danger will be of our thoughts, ideas, nightmares etc. to be shared without our consent. The more we tap into a virtual world, what stops anybody from tapping into our brain and practically stealing our identity in more than one way. I dunno if there are any Debian people or FSF projects working on the above. Even Laws can only do so much, until and unless there are alternative places and ways it would be difficult to say the least

Review: Thief of Time, by Terry Pratchett

Series:	Discworld #26
Publisher:	Harper
Copyright:	May 2001
Printing:	August 2014
ISBN:	0-06-230739-8
Format:	Mass market
Pages:	420

Thief of Time is the 26th Discworld novel and the last Death novel, although he still appears in subsequent books. It's the third book starring Susan Sto Helit, so I don't recommend starting here. Mort is the best starting point for the Death subseries, and Reaper Man provides a useful introduction to the villains. Jeremy Clockson was an orphan raised by the Guild of Clockmakers. He is very good at making clocks. He's not very good at anything else, particularly people, but his clocks are the most accurate in Ankh-Morpork. He is therefore the logical choice to receive a commission by a mysterious noblewoman who wants him to make the most accurate possible clock: a clock that can measure the tick of the universe, one that a fairy tale says had been nearly made before. The commission is followed by a surprise delivery of an Igor, to help with the clock-making. People who live in places with lots of fields become farmers. People who live where there is lots of iron and coal become blacksmiths. And people who live in the mountains near the Hub, near the gods and full of magic, become monks. In the highest valley are the History Monks, founded by Wen the Eternally Surprised. Like most monks, they take apprentices with certain talents and train them in their discipline. But Lobsang Ludd, an orphan discovered in the Thieves Guild in Ankh-Morpork, is proving a challenge. The monks decide to apprentice him to Lu-Tze the sweeper; perhaps that will solve multiple problems at once. Since Hogfather, Susan has moved from being a governess to a schoolteacher. She brings to that job the same firm patience, total disregard for rules that apply to other people, and impressive talent for managing children. She is by far the most popular teacher among the kids, and not only because she transports her class all over the Disc so that they can see things in person. It is a job that she likes and understands, and one that she's quite irate to have interrupted by a summons from her grandfather. But the Auditors are up to something, and Susan may be able to act in ways that Death cannot. This was great. Susan has quickly become one of my favorite Discworld characters, and this time around there is no (or, well, not much) unbelievable romance or permanently queasy god to distract. The clock-making portions of the book quickly start to focus on Igor, who is a delightful perspective through whom to watch events unfold. And the History Monks! The metaphysics of what they are actually doing (which I won't spoil, since discovering it slowly is a delight) is perhaps my favorite bit of Discworld world building to date. I am a sucker for stories that focus on some process that everyone thinks happens automatically and investigate the hidden work behind it. I do want to add a caveat here that the monks are in part a parody of Himalayan Buddhist monasteries, Lu-Tze is rather obviously a parody of Laozi and Daoism in general, and Pratchett's parodies of non-western cultures are rather ham-handed. This is not quite the insulting mess that the Chinese parody in Interesting Times was, but it's heavy on the stereotypes. It does not, thankfully, rely on the stereotypes; the characters are great fun on their own terms, with the perfect (for me) balance of irreverence and thoughtfulness. Lu-Tze refusing to be anything other than a sweeper and being irritatingly casual about all the rules of the order is a classic bit that Pratchett does very well. But I also have the luxury of ignoring stereotypes of a culture that isn't mine, and I think Pratchett is on somewhat thin ice. As one specific example, having Lu-Tze's treasured sayings be a collection of banal aphorisms from a random Ankh-Morpork woman is both hilarious and also arguably rather condescending, and I'm not sure where I landed. It's a spot-on bit of parody of how a lot of people who get very into "eastern religions" sound, but it's also equating the Dao De Jing with advice from the Discworld equivalent of a English housewife. I think the generous reading is that Lu-Tze made the homilies profound by looking at them in an entirely different way than the woman saying them, and that's not completely unlike Daoism and works surprisingly well. But that's reading somewhat against the grain; Pratchett is clearly making fun of philosophical koans, and while anything is fair game for some friendly poking, it still feels a bit weird. That isn't the part of the History Monks that I loved, though. Their actual role in the story doesn't come out of the parody. It's something entirely native to Discworld, and it's an absolute delight. The scene with Lobsang and the procrastinators is perhaps my favorite Discworld set piece to date. Everything about the technology of the History Monks, even the Bond parody, is so good. I grew up reading the Marvel Comics universe, and Thief of Time reminds me of a classic John Byrne or Jim Starlin story, where the heroes are dumped into the middle of vast interdimensional conflicts involving barely-anthropomorphized cosmic powers and the universe is revealed to work in ever more intricate ways at vastly expanding scales. The Auditors are villains in exactly that tradition, and just like the best of those stories, the fulcrum of the plot is questions about what it means to be human, what it means to be alive, and the surprising alliances these non-human powers make with humans or semi-humans. I devoured this kind of story as a kid, and it turns out I still love it. The one complaint I have about the plot is that the best part of this book is the middle, and the end didn't entirely work for me. Ronnie Soak is at his best as a supporting character about three quarters of the way through the book, and I found the ending of his subplot much less interesting. The cosmic confrontation was oddly disappointing, and there's a whole extended sequence involving chocolate that I think was funnier in Pratchett's head than it was in mine. The ending isn't bad, but the middle of this book is my favorite bit of Discworld writing yet, and I wish the story had carried that momentum through to the end. I had so much fun with this book. The Discworld novels are clearly getting better. None of them have yet vaulted into the ranks of my all-time favorite books there's always some lingering quibble or sagging bit but it feels like they've gone from reliably good books to more reliably great books. The acid test is coming, though: the next book is a Rincewind book, which are usually the weak spots. Followed by The Last Hero in publication order. There is no direct thematic sequel. Rating: 8 out of 10

For those who can t see the above poster says the following A handful of retired Supreme Court judges who are part of Anti-India and are trying to make Indian judiciary play role of the Opposition party. Law Minister Kiren Rijiju. Now, just to give bit more of a context, the above has happened as the CJI (Chief Justice has not been listening or toeing their line)

The above is a statement given by CJI DY Chandrachud. He says and I quote Democracy needs truth to survive. Democracy and truth go hand in hand. Speaking truth to power is a right of every citizen in a democracy. It is equally a duty.

Another quote by him. I am personally averse to sealed covers. There has to be transparency in Court. This is about implementing the orders. What can be secrecy here. Now I need to again give context of the various statements given. The law minister who gave this statement, his name incidentally came first under the radar sometime in 2013-2014 in a list given/shared by Union Home Secretary R K Singh (then in UPA, now in BJP) as an anti-India China sympathizer. Of course today all sorts of thugs use the brand nationalism he is one of them. Now as far as the CJI is concerned, AFAIK I know he bent over backwards for them but they were not pleased with him. The latest sealed cover statement is because BJP wanted to put a sealed cover in the ongoing Udhav Thackeray vs Eknath Shinde case where BJP or the Eknath Shinde faction tried to give a sealed cover. The problem with sealed cover is for any defence there is nothing to fight against it. It could be a blank paper, it could be whole lot of gossip or innuendo, unless it s out in the open the prosecution in this case i.e. Udhav Thackeray team could not effectively fight as they do not know the contents of the said sealed cover. This goes against all judicial norms. The U.S. tried with sealed cover and ended the practise only after couple of cases. Even in the OROP case (One Rank One Pension) the Govt. tried the same trick. As I have shared before, this actually comes Senator Joseph Mccarthy who started this whole thing in 1950 s, a bit of background on him. I probably had shared about him before but it bears to know and remember again and again. The same is what Mr. Trump has done time and again. It s a similar script. Bully your opponents and use sealed cover as no questions can be asked about it. The good news though is that the views of CJI have been changing. So, yes they would like to change the CJI, they actually have been trying to have their own man have keys to the judiciary but without success so far, soon though this bastion would also fall, if not today then tomorrow. The EC (Election Commission) has been thoroughly compromised. Would not share more as that EC s bending acts would use a whole blog post or two to share the numerous instances where BJP has been given all rights. In fact, via RTI it came to be known that many of the BJP leaders had given false statements about their education in the EC affidavit. That alone should have been grounds for throwing out the legislators but in their wisdom they see fit to remain blind. The latest case is of Mr. Nishikant Dubey. Of course, even with the legal documents shared in public domain and EC having powers to verify such documents, they are choosing to remain silent. If one has lied about one thing in an affidavit, what or how many lies he has shared in that same document. And how are you supposed to trust anything that comes out of his mouth. This is the state of not just Mr. Dubey but a whole lot of people who are in BJP today. And EC as an institution seems to be let down. There was a time when it was lead by T.N. Seshan who was courageous, fearless and fair. He later went to become Speaker but even then he was harsh but fair. Perhaps people thought we will always have people like him. Hence instead of strong institutions having strong rules, we based our trust on people and hence we are where we are

There is much to share but some other day, Till later.